Preface to the Mathematical Background



We want you to reason with mathematics. We are not trying to get everyone to give 
formalized proofs in the sense of contemporary mathematics; ‘proof’ in this course means 
‘convincing argument.’ We expect you to use correct reasoning and to give careful expla- 
nations. The projects bring out these issues in the way we find best for most students, 
but the pure mathematical questions also interest some students. This book of mathemat- 
ical “background” shows how to fill in the mathematical details of the main topics from 
the course. These proofs are completely rigorous in the sense of modern mathematics - 
technically bulletproof. We wrote this book of foundations in part to provide a convenient 
reference for a student who might like to see the “theorem - proof” approach to calculus. 

We also wrote it for the interested instructor. In re-thinking the presentation of beginning 
calculus, we found that a simpler basis for the theory was both possible and desirable. The 
pointwise approach most books give to the theory of derivatives spoils the subject. Clear 
simple arguments like the proof of the Fundamental Theorem at the start of Chapter 5 below 
are not possible in that approach. The result of the pointwise approach is that instructors 
feel they have to either be dishonest with students or disclaim good intuitive approximations. 
This is sad because it makes a clear subject seem obscure. It is also unnecessary - by and 
large, the intuitive ideas work provided your notion of derivative is strong enough. This 
book shows how to bridge the gap between intuition and technical rigor. 

A function with a positive derivative ought to be increasing. After all, the slope is 
positive and the graph is supposed to look like an increasing straight line. How could the 
function NOT be increasing? Pointwise derivatives make this bizarre thing possible - a 
positive “derivative” of a non-increasing function. Our conclusion is simple. That definition 
is WRONG in the sense that it does NOT support the intended idea. 

You might agree that the counterintuitive consequences of pointwise derivatives are un- 
fortunate, but are concerned that the traditional approach is more “general.” Part of the 
point of this book is to show students and instructors that nothing of interest is lost and a 
great deal is gained in the straightforward nature of the proofs based on “uniform” deriva- 
tives. It actually is not possible to give a formula that is pointwise differentiable and not 
uniformly differentiable. The pieced together pointwise counterexamples seem contrived 
and out-of-place in a course where students are learning valuable new rules. It is a theorem 
that derivatives computed by rules are automatically continuous where defined. We want 
the course development to emphasize good intuition and positive results. This background 
shows that the approach is sound. 

This book also shows how the pathologies arise in the traditional approach - we left 
pointwise pathology out of the main text, but present it here for the curious and for com- 
parison. Perhaps only math majors ever need to know about these sorts of examples, but 
they are fun in a negative sort of way. 

This book also has several theoretical topics that are hard to find in the literature. It 
includes a complete self-contained treatment of Robinson’s modern theory of infinitesimals, 
first discovered in 1961. Our simple treatment is due to H. Jerome Keisler from the 1970’s. 
Keisler’s elementary calculus using infinitesimals is sadly out of print. It used pointwise 
derivatives, but had many novel ideas, including the first modern use of a microscope to 
describe the derivative. (The l’Hospital/Bernoulli calculus text of 1696 said curves consist
of infinitesimal straight segments, but I do not know if that was associated with a magnifying
transformation.) Infinitesimals give us a very simple way to understand the uniform
derivatives, although this can also be clearly understood using function limits as in the text
by Lax, et al, from the 1970s. Modern graphical computing can also help us “see” graphs 
converge as stressed in our main materials and in the interesting Uhl, Porta, Davis, Calculus 
& Mathematica text. 

Almost all the theorems in this book are well-known old results of a carefully studied 
subject. The well-known ones are more important than the few novel aspects of the book. 
However, some details like the converse of Taylor’s theorem - both continuous and discrete - 
are not so easy to find in traditional calculus sources. The microscope theorem for differential 
equations does not appear in the literature as far as we know, though it is similar to research 
work of Francine and Marc Diener from the 1980s. 

We conclude the book with convergence results for Fourier series. While there is nothing 
novel in our approach, these results have been lost from contemporary calculus and deserve 
to be part of it. Our development follows Courant’s calculus of the 1930s giving wonderful 
results of Dirichlet’s era in the 1830s that clearly settle some of the convergence mysteries 
of Euler from the 1730s. This theory and our development throughout is usually easy to 
apply. “Clean” theory should be the servant of intuition - building on it and making it 
stronger and clearer. 

There is more that is novel about this “book.” It is free and it is not a “book” since it is 
not printed. Thanks to small marginal cost, our publisher agreed to include this electronic 
text on CD at no extra cost. We also plan to distribute it over the world wide web. We 
hope our fresh look at the foundations of calculus will stimulate your interest. Decide for 
yourself what’s the best way to understand this wonderful subject. Give your own proofs. 
CHAPTER 1

Numbers 



This chapter gives the algebraic laws of the number systems used 
in calculus. 




Numbers represent various idealized measurements. Positive integers may count items, 
fractions may represent a part of an item or a distance that is part of a fixed unit. Distance 
measurements go beyond rational numbers as soon as we consider the hypotenuse of a right 
triangle or the circumference of a circle. This extension is already in the realm of imagined 
“perfect” measurements because it corresponds to a perfectly straight-sided triangle with 
perfect right angle, or a perfectly round circle. Actual real measurements are always rational 
and have some error or uncertainty. 

The various “imaginary” aspects of numbers are very useful fictions. The rules of com- 
putation with perfect numbers are much simpler than with the error-containing real mea- 
surements. This simplicity makes fundamental ideas clearer. 

Hyperreal numbers have ‘teeny tiny numbers’ that will simplify approximation estimates. 
Direct computations with the ideal numbers produce symbolic approximations equivalent 
to the function limits needed in differentiation theory (that the rules of Theorem 1.12 give 
a direct way to compute.) Limit theory does not give the answer, but only a way to justify 
it once you have found it. 



1.1 Field Axioms 



The laws of algebra follow from the field axioms. This means that algebra 
is the same with Dedekind’s “real” numbers, the complex numbers, and 
Robinson’s “hyperreal” numbers. 



Axiom 1.1. Field Axioms 

A “field” of numbers is any set of objects together with two operations, addition 
and multiplication where the operations satisfy: 

• The commutative laws of addition and multiplication, 

$$a_1 + a_2 = a_2 + a_1 \quad \& \quad a_1 \cdot a_2 = a_2 \cdot a_1$$

• The associative laws of addition and multiplication,

$$a_1 + (a_2 + a_3) = (a_1 + a_2) + a_3 \quad \& \quad a_1 \cdot (a_2 \cdot a_3) = (a_1 \cdot a_2) \cdot a_3$$

• The distributive law of multiplication over addition,

$$a_1 \cdot (a_2 + a_3) = a_1 \cdot a_2 + a_1 \cdot a_3$$

• There is an additive identity, 0, with $0 + a = a$ for every number $a$.

• There is a multiplicative identity, 1, with $1 \cdot a = a$ for every number $a \ne 0$.

• Each number $a$ has an additive inverse, $-a$, with $a + (-a) = 0$.

• Each nonzero number $a$ has a multiplicative inverse, $\frac{1}{a}$, with $a \cdot \frac{1}{a} = 1$.

A computation needed in calculus is 
Example 1.1. The Cube of a Binomial 



$$(x + \Delta x)^3 = x^3 + 3x^2 \Delta x + 3x \Delta x^2 + \Delta x^3$$
$$= x^3 + 3x^2 \Delta x + (\Delta x(3x + \Delta x)) \Delta x$$

We analyze the term $\varepsilon = (\Delta x(3x + \Delta x))$ in differentiation.

The reader could laboriously demonstrate that only the field axioms are needed to perform 
the computation. This means it holds for rational, real, complex, or hyperreal numbers. 
Here is a start. Associativity is needed so that the cube is well defined, or does not depend 
on the order we multiply. We use this in the next computation, then use the distributive 
property, the commutativity and the distributive property again, and so on. 

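$$(x + \Delta x)^3 = (x + \Delta x)(x + \Delta x)(x + \Delta x)$$
$$= (x + \Delta x)((x + \Delta x)(x + \Delta x))$$
$$= (x + \Delta x)((x + \Delta x)x + (x + \Delta x)\Delta x)$$
$$= (x + \Delta x)((x^2 + x\Delta x) + (x\Delta x + \Delta x^2))$$
$$= (x + \Delta x)(x^2 + x\Delta x + x\Delta x + \Delta x^2)$$
$$= (x + \Delta x)(x^2 + 2x\Delta x + \Delta x^2)$$
$$= (x + \Delta x)x^2 + (x + \Delta x)2x\Delta x + (x + \Delta x)\Delta x^2$$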
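The claim that the binomial cube uses only the field axioms can be spot-checked in more than one field. The following is a minimal sketch, not part of the original text: it evaluates both sides of the identity with exact rational numbers and with complex numbers whose parts are small integers (so the floating-point arithmetic is exact). The helper name `cube_identity_holds` is hypothetical, and a few numerical checks are of course not a proof.

```python
from fractions import Fraction

def cube_identity_holds(x, dx):
    """Check (x + dx)^3 == x^3 + 3 x^2 dx + (dx (3x + dx)) dx for given numbers."""
    s = x + dx
    left = s * s * s                      # the cube, written as repeated products
    eps = dx * (3 * x + dx)               # the term analyzed in differentiation
    right = x * x * x + 3 * x * x * dx + eps * dx
    return left == right

# Exact rational arithmetic
print(cube_identity_holds(Fraction(2, 3), Fraction(-1, 7)))
# Complex numbers with small integer real and imaginary parts
print(cube_identity_holds(complex(1, 2), complex(3, -1)))
```

Only addition and multiplication are used, so the same function accepts any Python number type that implements those two operations.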



The natural counting numbers 1 , 2 , 3 ,... have operations of addition and multiplication, 
but do not satisfy all the properties needed to be a field. Addition and multiplication do 
satisfy the commutative, associative, and distributive laws, but there is no additive identity
0 in the counting numbers. In ancient times, it was controversial to add this element that
could stand for counting nothing, but it is a useful fiction in many kinds of computations. 

The negative integers -1, -2, -3, . . . are another idealization added to the natural numbers
that makes additive inverses possible - they are just new numbers with the needed
property. Negative integers have perfectly concrete interpretations such as measurements 
to the left, rather than the right, or amounts owed rather than earned. 

The set of all integers, positive, negative, and zero, still does not form a field because there
are no multiplicative inverses. Fractions, ±1/2, ±1/3, . . . are the needed additional inverses.
When they are combined with the integers through addition, we have the set of all rational
numbers of the form ±p/q for natural numbers p and q ≠ 0. The rational numbers are a
field, that is, they satisfy all the axioms above. In ancient times, rationals were sometimes
considered only “operators” on “actual” numbers like 1, 2, 3, . . . .

The point of the previous paragraphs is simply that we often extend one kind of number 
system in order to have a new system with useful properties. The complex numbers extend 
the field axioms above beyond the “real” numbers by adding a number $i$ that solves the
equation $i^2 = -1$. (See the CD Chapter 29 of the main text.) Hundreds of years ago this
number was controversial and is still called “imaginary.” In fact, all numbers are useful
constructs of our imagination and some aspects of Dedekind’s “real” numbers are much
more abstract than $i^2 = -1$. (For example, since the reals are “uncountable,” “most” real
numbers have no description whatsoever.)

The rationals are not “complete” in the sense that the linear measurement of the hypotenuse
of an isosceles right triangle with unit legs ($\sqrt{2}$) cannot be expressed as p/q for p and q integers. In
Section 1.3 we “complete” the rationals to form Dedekind’s “real” numbers. These numbers 
correspond to perfect measurements along an ideal line with no gaps. 

The complex numbers cannot be ordered with a notion of “smaller than” that is compatible
with the field operations. Adding an “ideal” number to serve as the square root of -1 is
not compatible with the square of every nonzero number being positive. When we make extensions
beyond the real number system we need to make choices of the kind of extension depending 
on the properties we want to preserve. 

Hyperreal numbers allow us to compute estimates or limits directly, rather than making 
inverse proofs with inequalities. Like the complex extension, hyperreal extension of the reals 
loses a property; in this case completeness. Hyperreal numbers are explained beginning in 
Section 1.4 below and then are used extensively in this background book to show how many 
intuitive estimates lead to simple direct proofs of important ideas in calculus. 

The hyperreal numbers (discovered by Abraham Robinson in 1961) are still controver- 
sial because they contain infinitesimals. However, they are just another extended modern 
number system with a desirable new property. Hyperreal numbers can help you understand 
limits of real numbers and many aspects of calculus. Results of calculus could be proved 
without infinitesimals, just as they could be proved without real numbers by using only 
rationals. Many professors still prefer the former, but few prefer the latter. We believe that 
is only because Dedekind’s “real” numbers are more familiar than Robinson’s, but we will 
make it clear how both approaches work as a theoretical background for calculus. 

There is no controversy concerning the logical soundness of hyperreal numbers. The use 
of infinitesimals in the early development of calculus beginning with Leibniz, continuing with 
Euler, and persisting to the time of Gauss was problematic. The founders knew that their 
use of infinitesimals was logically incomplete and could lead to incorrect results. Hyperreal 
numbers are a correct treatment of infinitesimals that took nearly 300 years to discover. 
With hindsight, they also have a simple description. The Function Extension Axiom 2.1 
explained in detail in Chapter 2 was the missing key. 



Exercise set 1.1

1. Show that the identity numbers 0 and 1 are unique. (HINT: Suppose $0' + a = a$. Add
$-a$ to both sides.)

2. Show that $0 \cdot a = 0$. (HINT: Expand $(0 + \frac{b}{a}) \cdot a$ with the distributive law and show that
$0 \cdot a + b = b$. Then use the previous exercise.)

3. Show that the inverses $-a$ and $\frac{1}{a}$ are unique. (HINT: Suppose not, $0 = a - a = a + b$. Add $-a$
to both sides and use the associative property.)

4. Show that $-1 \cdot a = -a$. (HINT: Use the distributive property on $0 = (1 - 1) \cdot a$ and use
the uniqueness of the inverse.)

5. Show that $(-1) \cdot (-1) = 1$.

6. Other familiar properties of algebra follow from the axioms, for example, if $a_3 \ne 0$ and
$a_4 \ne 0$, then

$$\frac{a_1 + a_2}{a_3} = \frac{a_1}{a_3} + \frac{a_2}{a_3}, \quad \frac{a_1 \cdot a_2}{a_3 \cdot a_4} = \frac{a_1}{a_3} \cdot \frac{a_2}{a_4} \quad \& \quad a_3 \cdot a_4 \ne 0$$



1.2 Order Axioms 



Estimation is based on the inequality < of the real numbers. 



One important representation of rational and real numbers is as measurements of distance 
along a line. The additive identity 0 is located as a starting point and the multiplicative 
identity 1 is marked off (usually to the right on a horizontal line). Distances to the right 
correspond to positive numbers and distances to the left to negative ones. The inequality 
< indicates which numbers are to the left of others. The abstract properties are as follows. 

Axiom 1.2. Ordered Field Axioms 

A number system is an ordered field if it satisfies the field Axioms 1.1 and has a
relation < that satisfies: 

• Every pair of numbers a and b satisfies exactly one of the relations 

a = b, a < b, or b < a 

• If a < b and b < c, then a < c. 

• If a < b, then a + c < b + c. 

• If 0 < a and 0 < b, then 0 < a • b.

These axioms have simple interpretations on the number line. The first order axiom says 
that every two numbers can be compared; either two numbers are equal or one is to the left 
of the other. 
The second axiom, called transitivity, says that if a is left of b and b is left of c, then a is
left of c. 

The third axiom says that if a is left of b and we move both by a distance c, then the 
results are still in the same left-right order. 

The fourth axiom is the most difficult abstractly. All the compatibility with multiplication 
is built from it. 

The rational numbers satisfy all these axioms, as do the real and hyperreal numbers. The 
complex numbers cannot be ordered in a manner compatible with the operations of addition 
and multiplication. 

Definition 1.3. Absolute Value 

If a is a nonzero number in an ordered field, |a| is the larger of a and -a, that is,
|a| = a if -a < a and |a| = -a if a < -a. We let |0| = 0.



Exercise set 1.2

1. If 0 < a, show that -a < 0 by using the additive property.

2. Show that 0 < 1. (HINT: Recall the exercise that $(-1) \cdot (-1) = 1$ and argue by contradiction,
supposing 0 < -1.)

3. Show that $a \cdot a > 0$ for every $a \ne 0$.

4. Show that there is no order < on the complex numbers that satisfies the ordered field
axioms.

5. Prove that if a < b and c > 0, then $c \cdot a < c \cdot b$.

Prove that if 0 < a < b and 0 < c < d, then $c \cdot a < d \cdot b$.



1.3 The Completeness Axiom 



Dedekind’s “real” numbers represent points on an ideal line with no gaps. 



The number $\sqrt{2}$ is not rational. Suppose to the contrary that $\sqrt{2} = q/r$ for integers $q$
and $r$ with no common factors. Then $2r^2 = q^2$. The prime factorization of both sides must
be the same, but every prime occurs an even number of times in the factorization of a square,
so the factor 2 occurs an odd number of times on the left and an even number of times on
the right. This is a contradiction, so there is no rational number
whose square is 2.

A length corresponding to $\sqrt{2}$ can be approximated by (rational) decimals in various
ways, for example, 1 < 1.4 < 1.41 < 1.414 < 1.4142 < 1.41421 < 1.414213 < .... There 
is no rational for this sequence to converge to, even though it is “trying” to converge. For 
example, all the terms of the sequence are below 1.41422 < 1.4143 < 1.415 < 1.42 < 1.5 < 2. 
Even without remembering a fancy algorithm for finding square root decimals, you can test 
the successive decimal approximations by squaring, for example, $1.41421^2 = 1.9999899241$
and $1.41422^2 = 2.0000182084$.
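The squaring test described above can be turned into a small procedure. The following is a minimal sketch, not taken from the original text: it builds the decimal truncations of the square root of 2 one digit at a time, using only exact rational arithmetic and the comparison "is the square still below 2?". The function name `sqrt2_truncations` is hypothetical.

```python
from fractions import Fraction

def sqrt2_truncations(digits):
    """Return the rational decimals 1, 1.4, 1.41, ... found only by squaring."""
    approx = Fraction(1)
    step = Fraction(1)
    results = [approx]
    for _ in range(digits):
        step /= 10
        # Add the largest multiple of the step that keeps the square below 2.
        while (approx + step) ** 2 < 2:
            approx += step
        results.append(approx)
    return results

for a in sqrt2_truncations(5):
    print(float(a), float(a ** 2))
```

Every number produced is rational and its square is strictly below 2, so the sequence increases and stays bounded without ever reaching a rational limit.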

It is perfectly natural to add a new number to the rationals to stand for the limit of
the better and better approximations to $\sqrt{2}$. Similarly, we could devise approximations
to $\pi$ and make $\pi$ the number that stands for the limit of such successive approximations.
We would like a method to include “all such possible limits” without having to specify the 
particular approximations. Dedekind’s approach is to let the real numbers be the collection 
of all “cuts” on the rational line. 



Definition 1.4. A Dedekind Cut 

A “cut” in an ordered field is a pair of nonempty sets A and B so that: 

• Every number is either in A or B. 

• Every a in A is less than every b in B. 



We may think of $\sqrt{2}$ defining a cut of the rational numbers where A consists of all rational
numbers a with a < 0 or $a^2 < 2$ and B consists of all positive rational numbers b with $b^2 > 2$. There
is a “gap” in the rationals where we would like to have $\sqrt{2}$. Dedekind’s “real numbers” fill
all such gaps. In this case, a cut of real numbers would have to have $\sqrt{2}$ either in A or in B.



Axiom 1.5. Dedekind Completeness 

The real numbers are an ordered field such that if A and B form a cut in those
numbers, there is a number r such that r is in either A or in B and all the other
numbers a in A satisfy a < r and all the other numbers b in B satisfy r < b.



In other words, every cut on the “real” line is made at some specific number r, so there 
are no gaps. This seems perfectly reasonable in cases like $\sqrt{2}$ and $\pi$ where we know specific
ways to describe the associated cuts. The only drawback to Dedekind’s number system 
is that “every cut” is not a very concrete notion, but rather relies on an abstract notion 
of “every set.” This leads to some paradoxical facts about cuts that do not have specific 
descriptions, but these need not concern us. Every specific cut has a real number in the 
middle. 

Completeness of the reals means that “approximation procedures” that are “improving” 
converge to a number. We need to be more specific later, but for example, bounded in- 
creasing or decreasing sequences converge and “Cauchy” sequences converge. We will not 
describe these details here, but take them up as part of our study of limits below. 

Completeness has another important consequence, the Archimedean Property Theo- 
rem 1.8. We take that up in the next section. The Archimedean Property says precisely that 
the real numbers contain no positive infinitesimals. Hyperreal numbers extend the reals by 
including infinitesimals. (As a consequence the hyperreals are not Dedekind complete.) 
1.4 Small, Medium and Large Numbers



Hyperreal numbers give us a way to simplify estimation by adding infinites- 
imal numbers to the real numbers. 



We want to have three different intuitive sizes of numbers, very small, medium size, and 
very large. Most important, we want to be able to compute with these numbers using the 
same rules of algebra as in high school and separate the ‘small’ parts of our computation. 
Hyperreal numbers give us these computational estimates. Hyperreal numbers satisfy three
axioms which we take up separately below: Axiom 1.7, Axiom 1.9, and Axiom 2.1.

As a first intuitive approximation, we could think of these scales of numbers in terms of
the computer screen. In this case, ‘medium’ numbers might be numbers in the range -999 to
+999 that name a screen pixel. Numbers closer than one unit could not be distinguished by
different screen pixels, so these would be ‘tiny’ numbers. Moreover, two medium numbers
a and b would be indistinguishably close, $a \approx b$, if their difference was a ‘tiny’ number less
than a pixel. Numbers larger in magnitude than 999 are too big for the screen and could
be considered ‘huge.’

The screen distinction of sizes of computer numbers is a good analogy, but there are difficulties
with the algebra of screen-size numbers. We want to have ordinary rules of algebra
and the following properties of approximate equality. For now, all you should think of is
that $\approx$ means ‘approximately equals.’

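(a) If $p$ and $q$ are medium, so are $p + q$ and $p \cdot q$.

(b) If $\varepsilon$ and $\delta$ are tiny, so is $\varepsilon + \delta$, that is, $\varepsilon \approx 0$ and $\delta \approx 0$ implies $\varepsilon + \delta \approx 0$.

(c) If $\delta \approx 0$ and $q$ is medium, then $q \cdot \delta \approx 0$.

(d) $1/0$ is still undefined and $1/x$ is huge only when $x \approx 0$.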



You can see that the computer number idea does not quite work, because the approximation
rules don’t always apply. If $p = 15.37$ and $q = -32.4$, then $p \cdot q = -497.988 \approx -498$, ‘medium
times medium is medium.’ However, if $p = 888$ and $q = 777$, then $p \cdot q$ is no longer screen
size.
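The failure of the screen analogy is easy to reproduce. The following is a minimal sketch under the assumptions of the paragraph above, with ‘medium’ meaning magnitude at most 999; the names `SCREEN_LIMIT` and `is_medium` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

SCREEN_LIMIT = 999  # 'medium' numbers lie in the range -999 to +999

def is_medium(x):
    """A number is 'medium' in the screen analogy if it fits on the screen."""
    return abs(x) <= SCREEN_LIMIT

# First pair: the product of two medium numbers is still medium.
p, q = Fraction("15.37"), Fraction("-32.4")
print(float(p * q), is_medium(p * q))

# Second pair: both factors are medium, but the product is off the screen.
print(888 * 777, is_medium(888 * 777))
```

The second product shows that rule (a) fails for screen numbers, which is why an idealized number system is needed.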

The hyperreal numbers extend the ‘real’ number system to include ‘ideal’ numbers that 
obey these simple approximation rules as well as the ordinary rules of algebra and trigonom- 
etry. Very small numbers technically are called infinitesimals and what we shall assume that 
is different from high school is that there are positive infinitesimals. 



Definition 1.6. Infinitesimal Number 

A number $\delta$ in an ordered field is called infinitesimal if it satisfies

$$\frac{1}{2} > \frac{1}{3} > \frac{1}{4} > \cdots > \frac{1}{m} > \cdots > |\delta|$$

for any ordinary natural counting number $m = 1, 2, 3, \cdots$. We write $a \approx b$ and say
$a$ is infinitely close to $b$ if the number $b - a \approx 0$ is infinitesimal.

This definition is intended to include 0 as “infinitesimal.”



Axiom 1.7. The Infinitesimal Axiom 

The hyperreal numbers contain the real numbers, but also contain nonzero infinitesimal
numbers, that is, numbers $\delta \approx 0$, positive, $\delta > 0$, but smaller than all the real
positive numbers.

This stands in contrast to the following result. 

Theorem 1.8. The Archimedean Property 

The hyperreal numbers are not Dedekind complete and there are no positive
infinitesimal numbers in the ordinary reals, that is, if $r > 0$ is a positive real number,
then there is a natural counting number $m$ such that $0 < \frac{1}{m} < r$.

Proof:

We define a cut above all the positive infinitesimals. The set A consists of all numbers a
satisfying a < 1/m for every natural counting number m. The set B consists of all numbers
b such that there is a natural number m with 1/m < b. The pair A, B defines a Dedekind
cut in the rationals, reals, and hyperreal numbers. If there is a positive $\delta$ in A, then there
cannot be a number at the gap. In other words, there is no largest positive infinitesimal or
smallest positive non-infinitesimal. This is clear because $\delta < \delta + \delta$ and $2\delta$ is still infinitesimal,
while if $\varepsilon$ is in B, $\varepsilon/2 < \varepsilon$ must also be in B.

Since the real numbers must have a number at the “gap,” there cannot be any positive 
infinitesimal reals. Zero is at the gap in the reals and every positive real number is in B. 
This is what the theorem asserts, so it is proved. Notice that we have also proved that the 
hyperreals are not Dedekind complete, because the cut in the hyperreals must have a gap. 
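For ordinary numbers, the natural number promised by the Archimedean Property can be written down explicitly. The following is a minimal sketch, assuming a positive rational r as a stand-in for a real number; the function name `archimedean_witness` is hypothetical and the construction is an illustration, not the argument used in the proof above.

```python
from fractions import Fraction

def archimedean_witness(r):
    """Return a natural number m with 0 < 1/m < r for a positive rational r."""
    r = Fraction(r)
    if r <= 0:
        raise ValueError("r must be positive")
    # floor(1/r) + 1 is strictly larger than 1/r, so its reciprocal is below r.
    return r.denominator // r.numerator + 1

for r in (Fraction(3, 1000), Fraction(1, 2), Fraction(5)):
    m = archimedean_witness(r)
    print(r, m, Fraction(1, m) < r)
```

No such m can exist for a positive infinitesimal, which is exactly what separates the hyperreal numbers from the reals.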

Two ordinary real numbers, a and b, satisfy $a \approx b$ only if a = b, since the ordinary real
numbers do not contain infinitesimals. Zero is the only real number that is infinitesimal. 

If you prefer not to say ‘infinitesimal,’ just say ‘$\delta$ is a tiny positive number’ and think
of $\approx$ as ‘close enough for the computations at hand.’ The computation rules above are still
important intuitively and can be phrased in terms of limits of functions if you wish. The 
intuitive rules help you find the limit. 

The next axiom about the new “hyperreal” numbers says that you can continue to do 
the algebraic computations you learned in high school. 

Axiom 1.9. The Algebra Axiom (Including < rules.) 

The hyperreal numbers are an ordered field, that is, they obey the same rules of
ordered algebra as the real numbers, Axiom 1.1 and Axiom 1.2.

The algebra of infinitesimals that you need can be learned by working the examples and 
exercises in this chapter. 

Functional equations like the addition formulas for sine and cosine or the laws of logs 
and exponentials are very important. (The specific high school identities are reviewed in 
the main text CD Chapter 28 on High School Review.) The Function Extension Axiom 2.1 
shows how to extend the non-algebraic parts of high school math to hyperreal numbers. 
This axiom is the key to Robinson’s rigorous theory of infinitesimals and it took 300 years 
to discover. You will see by working with it that it is a perfectly natural idea, as hindsight 
often reveals. We postpone that to practice with the algebra of infinitesimals. 

Example 1.2. The Algebra of Small Quantities 
Let’s re-calculate the increment of the basic cubic using the new numbers. Since the rules
of algebra are the same, the same basic steps still work (see Example 1.1), except now we
may take $x$ any number and $\delta x$ an infinitesimal.

Small Increment of $f[x] = x^3$

$$f[x + \delta x] = (x + \delta x)^3 = x^3 + 3x^2 \delta x + 3x \delta x^2 + \delta x^3$$
$$f[x + \delta x] = f[x] + 3x^2 \delta x + (\delta x[3x + \delta x]) \delta x$$
$$f[x + \delta x] = f[x] + f'[x] \delta x + \varepsilon \delta x$$

with $f'[x] = 3x^2$ and $\varepsilon = (\delta x[3x + \delta x])$. The intuitive rules above show that $\varepsilon \approx 0$ whenever
$x$ is finite. (See Theorem 1.12 and Example 1.8 following it for the precise rules.)
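The behavior of the error factor can be watched numerically. The following is a minimal sketch that uses small ordinary rationals as stand-ins for an infinitesimal increment (a computer cannot hold a true infinitesimal, so this is only an analogy); the function name `increment_terms` is hypothetical.

```python
from fractions import Fraction

def increment_terms(x, dx):
    """Split the increment of f(x) = x^3 into its linear part and the factor eps."""
    linear = 3 * x**2 * dx          # f'(x) dx
    eps = dx * (3 * x + dx)         # error factor multiplying dx
    return linear, eps

x = Fraction(2)
for k in range(1, 6):
    dx = Fraction(1, 10**k)
    linear, eps = increment_terms(x, dx)
    # The increment formula holds exactly, not just approximately.
    assert (x + dx)**3 == x**3 + linear + eps * dx
    print(f"dx = {float(dx):.5f}   eps = {float(eps):.7f}")
```

As the increment shrinks by a factor of ten, the error factor shrinks by roughly the same factor, which is the ordinary-number shadow of the statement that the error factor is infinitesimal when the increment is.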

Example 1.3. Finite Non-Real Numbers 

The hyperreal numbers obey the same rules of algebra as the familiar numbers from high 
school. We know that $r + \Delta > r$, whenever $\Delta > 0$ is an ordinary positive high school number.
(See the addition property of Axiom 1.2.) Since hyperreals satisfy the same rules of algebra,
we also have new finite numbers given by a high school number $r$ plus an infinitesimal,

$$a = r + \delta > r$$

The number $a = r + \delta$ is different from $r$, even though it is infinitely close to $r$. Since $\delta$ is
small, the difference between $a$ and $r$ is small

$$0 < a - r = \delta \approx 0 \quad \text{or} \quad a \approx r \text{ but } a \ne r$$

Here is a technical definition of “finite” or “limited” hyperreal number. 

Definition 1.10. Limited and Unlimited Hyperreal Numbers 

A hyperreal number $x$ is said to be finite (or limited) if there is an ordinary natural
number $m = 1, 2, 3, \cdots$ so that

$$|x| < m.$$

If a number is not finite, we say it is infinitely large (or unlimited). 

Ordinary real numbers are part of the hyperreal numbers and they are finite because 
they are smaller than the next integer after them. Moreover, every finite hyperreal number 
is near an ordinary real number (see Theorem 1.11 below), so the previous example is the 
most general kind of finite hyperreal number there is. The important thing is to learn to 
compute with approximate equalities. 

Example 1.4. A Magnified View of the Hyperreal Line 

Of course, infinitesimals are finite, since $\delta \approx 0$ implies that $|\delta| < 1$. The finite numbers are
not just the ordinary real numbers and the infinitesimals clustered near zero. The rules of
algebra say that if we add or subtract a nonzero number from another, the result is a different
number. For example, $\pi - \delta < \pi < \pi + \delta$, when $0 < \delta \approx 0$. These are distinct finite hyperreal
numbers but each of these numbers differs from the others by only an infinitesimal, $\pi \approx \pi + \delta \approx \pi - \delta$. If
we plotted the hyperreal number line at unit scale, we could only put one dot for all three.
However, if we focus a microscope of power $1/\delta$ at $\pi$ we see three points separated by unit
distances.
The basic fact is that finite numbers only differ from reals by an infinitesimal. (This is
equivalent to Dedekind’s Completeness Axiom.) 

Theorem 1.11. Standard Parts of Finite Numbers 

Every finite hyperreal number $x$ differs from some ordinary real number $r$ by an
infinitesimal amount, $x - r \approx 0$ or $x \approx r$. The ordinary real number infinitely near
$x$ is called the standard part of $x$, $r = \mathrm{st}(x)$.

Proof: 

Suppose $x$ is a finite hyperreal. Define a cut in the real numbers by letting A be the
set of all real numbers a satisfying $a \le x$ and letting B be the set of all real numbers b with
$x < b$. Both A and B are nonempty because $x$ is finite. Every a in A is below every b
in B by transitivity of the order on the hyperreals. The completeness of the real numbers
means that there is a real $r$ at the gap between A and B. We must have $x \approx r$, because if
$x - r > 1/m$, say, then $r + 1/(2m) < x$, so the real number $r + 1/(2m)$ is in A, yet by the gap
property it would need to be in B.

A picture of the hyperreal number line looks like the ordinary real line at unit scale. 
We can’t draw far enough to get to the infinitely large part and this theorem says each 
finite number is indistinguishably close to a real number. If we magnify or compress by new 
number amounts we can see new structure. 

You still cannot divide by zero (that violates rules of algebra), but if $\delta$ is a positive
infinitesimal, we can compute the following:

$$-\delta, \quad \delta^2, \quad \frac{1}{\delta}$$

What can we say about these quantities?

The idealization of infinitesimals lets us have our cake and eat it too. Since $\delta \ne 0$, we
can divide by $\delta$. However, since $\delta$ is tiny, $1/\delta$ must be HUGE.

Example 1.5. Negative infinitesimals 

In ordinary algebra, if $\Delta > 0$, then $-\Delta < 0$, so we can apply this rule to the infinitesimal
number $\delta$ and conclude that $-\delta < 0$, since $\delta > 0$.

Example 1.6. Orders of infinitesimals 

In ordinary algebra, if $0 < \Delta < 1$, then $0 < \Delta^2 < \Delta$, so $0 < \delta^2 < \delta$.

We want you to formulate this more exactly in the next exercise. Just assume $\delta$ is
very small, but positive. Formulate what you want to draw algebraically. Try some small
ordinary numbers as examples, like $\delta = 0.01$. Plot $\delta$ at unit scale and place $\delta^2$ accurately
on the figure.
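Trying small ordinary numbers, as suggested, might look like the following. This is a minimal sketch with ordinary rationals standing in for an infinitesimal; the function name `magnified_positions` is hypothetical, and the magnification by the reciprocal of the small number is one possible choice of scale, not the only one.

```python
from fractions import Fraction

def magnified_positions(delta):
    """Positions of delta and delta^2 after magnifying the line by 1/delta."""
    power = 1 / delta
    return delta * power, delta**2 * power

for delta in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    square = delta**2
    at_delta, at_square = magnified_positions(delta)
    print(float(delta), float(square), float(at_delta), float(at_square))
```

At unit scale both points crowd against zero, while under the magnification the smaller number sits at 1 and its square sits at the smaller number itself, still crowded against zero.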

Example 1.7. Infinitely large numbers

For real numbers if $0 < \Delta < 1/n$ then $n < 1/\Delta$. Since $\delta$ is infinitesimal, $0 < \delta < 1/n$
for every natural number $n = 1, 2, 3, \ldots$ Using ordinary rules of algebra, but substituting
the infinitesimal $\delta$, we see that $H = 1/\delta > n$ is larger than any natural number $n$ (or is
“infinitely large”), that is, $1 < 2 < 3 < \ldots < n < H$, for every natural number $n$. We can
“see” infinitely large numbers by turning the microscope around and looking in the other
end.

The new algebraic rules are the ones that tell us when quantities are infinitely close, 
a ≈ b. Such rules, of course, do not follow from rules about ordinary high school numbers, 
but the rules are intuitive and simple. More important, they let us ‘calculate limits’ directly. 

Theorem 1.12. Computation Rules for Finite and Infinitesimal Numbers 

(a) If p and q are finite, so are p + q and p · q. 

(b) If ε and δ are infinitesimal, so is ε + δ. 

(c) If δ ≈ 0 and q is finite, then q · δ ≈ 0. (finite × infsml = infsml) 

(d) 1/0 is still undefined and 1/x is infinitely large only when x ≈ 0. 

To understand these rules, just think of p and q as “fixed,” if large, and δ as being as 
small as you please (but not zero). It is not hard to give formal proofs from the definitions 
above, but this intuitive understanding is more important. The last rule can be “seen” on 
the graph of y = 1/x. Look at the graph and move down near the values x ≈ 0. 




Proof: 

We prove rule (c) and leave the others to the exercises. If q is finite, there is a natural 
number m so that |q| < m. We want to show that |q · δ| < 1/n for any natural number n. 
Since δ is infinitesimal, we have |δ| < 1/(n · m). By Exercise 1.2.5, |q| · |δ| < m · 1/(n · m) = 1/n. 

Example 1.8. y = x³ ⇒ dy = 3x² dx, for finite x 

The error term in the increment of f[x] = x³, computed above is 

ε = (δx[3x + δx]) 

If x is assumed finite, then 3x is also finite by the first rule above. Since 3x and δx are finite, 
so is the sum 3x + δx by that rule. The third rule, that says an infinitesimal times a finite 
number is infinitesimal, now gives δx × finite = δx[3x + δx] = infinitesimal, ε ≈ 0. This 
justifies the local linearity of x³ at finite values of x, that is, we have used the approximation 
rules to show that 

f[x + δx] = f[x] + f'[x] δx + ε δx 

with ε ≈ 0 whenever δx ≈ 0 and x is finite, where f[x] = x³ and f'[x] = 3x². 
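
The computation in Example 1.8 can be spot-checked with small ordinary numbers standing in for the infinitesimal δx. The following is only a sketch in Python, not part of the course software; the function name `epsilon` and the sample values are our own choices. Exact rational arithmetic is used so that rounding plays no role.

```python
from fractions import Fraction

def epsilon(x, dx):
    """Error term in f[x+dx] - f[x] = f'[x]*dx + eps*dx for f[x] = x^3."""
    f = lambda t: t ** 3
    fprime = 3 * x ** 2
    return (f(x + dx) - f(x)) / dx - fprime

x = Fraction(2)
for n in (10, 1000, 100000):
    dx = Fraction(1, n)          # a small ordinary number standing in for δx
    eps = epsilon(x, dx)
    # The solved error agrees exactly with the closed form δx(3x + δx).
    assert eps == dx * (3 * x + dx)
    print(f"dx = 1/{n}: eps = {float(eps):.8f}")
```

The printed errors shrink in proportion to dx, which is the ordinary-number shadow of the statement ε ≈ 0 when δx ≈ 0 and x is finite.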



Exercise set 1.4 

1. Draw the view of the ideal number line when viewed under an infinitesimal microscope 
of power 1/δ. Which number appears unit size? How big does δ² appear at this scale? 
Where do the numbers δ and δ³ appear on a plot of magnification 1/δ²? 

2. Backwards microscopes or compression 

Draw the view of the new number line when viewed under an infinitesimal microscope 
with its magnification reversed to power δ (not 1/δ). What size does the infinitely large 
number H (HUGE) appear to be? What size does the finite (ordinary) number m = 10⁹ 
appear to be? Can you draw the number H² on the plot? 

3. y = x^p ⇒ dy = p x^(p−1) dx, p = 1, 2, 3, ... 

For each f[x] = x^p below: 

(a) Compute f[x + δx] − f[x] and simplify, writing the increment equation: 

f[x + δx] − f[x] = f'[x] · δx + ε · δx 

= [term in x but not δx]δx + [observed microscopic error]δx 

Notice that we can solve the increment equation for ε = (f[x + δx] − f[x])/δx − f'[x] 

(b) Show that ε ≈ 0 if δx ≈ 0 and x is finite. Does x need to be finite, or can it be 
any hyperreal number and still have ε ≈ 0? 

(1) If f[x] = x¹, then f'[x] = 1x⁰ = 1 and ε = 0. 

(2) If f[x] = x², then f'[x] = 2x and ε = δx. 

(3) If f[x] = x³, then f'[x] = 3x² and ε = (3x + δx)δx. 

(4) If f[x] = x⁴, then f'[x] = 4x³ and ε = (6x² + 4xδx + δx²)δx. 

(5) If f[x] = x⁵, then f'[x] = 5x⁴ and ε = (10x³ + 10x²δx + 5xδx² + δx³)δx. 


4. Exceptional Numbers and the Derivative of y = 1/x 

(a) Let f[x] = 1/x and show that 

(f[x + δx] − f[x])/δx = −1/(x(x + δx)) 

(b) Compute 

ε = −1/(x(x + δx)) + 1/x² = δx · 1/(x²(x + δx)) 

(c) Show that this gives 

f[x + δx] − f[x] = f'[x] · δx + ε · δx 

when f'[x] = −1/x². 

(d) Show that ε ≈ 0 provided x is NOT infinitesimal (and in particular is not zero.) 

5. Exceptional Numbers and the Derivative of y = √x 

(a) Let f[x] = √x and compute 

f[x + δx] − f[x] = δx/(√(x + δx) + √x) 

(b) Compute 

ε = 1/(√(x + δx) + √x) − 1/(2√x) = −δx/(2√x (√(x + δx) + √x)²) 

(c) Show that this gives 

f[x + δx] − f[x] = f'[x] · δx + ε · δx 

when f'[x] = 1/(2√x). 

(d) Show that ε ≈ 0 provided x is positive and NOT infinitesimal (and in particular 
is not zero.) 

6. Prove the remaining parts of Theorem 1.12. 








CHAPTER 2 



Functional Identities 



In high school you learned that trig functions satisfy certain identities 
or that logarithms have certain “properties.” This chapter 
extends the idea of functional identities from specific cases to a 
defining property of an unknown function. 



The use of “unknown functions” is of fundamental importance in calculus, and other 
branches of mathematics and science. For example, differential equations can be viewed as 
identities for unknown functions. 

One reason that students sometimes have difficulty understanding the meaning of deriva- 
tives or even the general rules for finding derivatives is that those things involve equations in 
unknown functions. The symbolic rules for differentiation and the increment approximation 
defining derivatives involve unknown functions. It is important for you to get used to this 
“higher type variable,” an unknown function. This chapter can form a bridge between the 
specific identities of high school and the unknown function variables from rules of calculus 
and differential equations. 



2.1 Specific Functional Identities 

All the identities you need to recall from high school are: 

(Cos[x])² + (Sin[x])² = 1                          CircleIden 
Cos[x + y] = Cos[x] Cos[y] − Sin[x] Sin[y]         CosSum 
Sin[x + y] = Sin[x] Cos[y] + Sin[y] Cos[x]         SinSum 
b^(x+y) = b^x b^y                                  ExpSum 
(b^x)^y = b^(x·y)                                  RepeatedExp 
Log[x · y] = Log[x] + Log[y]                       LogProd 
Log[x^p] = p Log[x]                                LogPower 



but you must be able to use these identities. Some practice exercises using these familiar 
identities are given in main text CD Chapter 28. 



2.2 General Functional Identities 



A general functional identity is an equation which is satisfied by an unknown 
function (or a number of functions) over its domain. 



The function 

f[x] = 2^x 

satisfies f[x + y] = 2^(x+y) = 2^x 2^y = f[x] f[y], so eliminating the two middle terms, we see 
that the function f[x] = 2^x satisfies the functional identity 

(ExpSum) f[x + y] = f[x] f[y] 

It is important to pay attention to the variable or variables in a functional identity. In order 
for an equation involving a function to be a functional identity, the equation must be valid for 
all values of the variables in question. Equation (ExpSum) above is satisfied by the function 
f[x] = 2^x for all x and y. For the function f[x] = x, it is true that f[2 + 2] = f[2] f[2], but 
f[3 + 1] ≠ f[3] f[1], so f[x] = x does not satisfy functional identity (ExpSum). 
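
The distinction between an identity and an equation that happens to hold at one pair of values can be tried out on a computer. The following Python sketch is our own illustration; the helper name `satisfies_exp_sum` and the sample pairs are assumptions, and a finite sample can only refute an identity, never prove it.

```python
import math

def satisfies_exp_sum(f, samples, tol=1e-9):
    """True if f(x + y) equals f(x) * f(y) (within tol) at every sample pair."""
    return all(math.isclose(f(x + y), f(x) * f(y), rel_tol=tol, abs_tol=tol)
               for x, y in samples)

samples = [(2, 2), (3, 1), (0.5, -1.5), (-2, 4)]

two_to_the = lambda x: 2.0 ** x      # f[x] = 2^x
identity = lambda x: x               # f[x] = x

print(satisfies_exp_sum(two_to_the, samples))   # holds at every sample pair
print(satisfies_exp_sum(identity, [(2, 2)]))    # the coincidence 2 + 2 = 2 * 2
print(satisfies_exp_sum(identity, samples))     # fails, for instance at (3, 1)
```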

Functional identities are a sort of ‘higher laws of algebra.’ Observe the notational 
similarity between the distributive law for multiplication over addition, 

m · (x + y) = m · x + m · y 

and the additive functional identity 

(Additive) f[x + y] = f[x] + f[y] 

Most functions f[x] do not satisfy the additive identity. For example, 

1/(x + y) ≠ 1/x + 1/y and √(x + y) ≠ √x + √y 

The fact that these are not identities means that for some choices of x and y in the domains 
of the respective functions f[x] = 1/x and f[x] = √x, the two sides are not equal. You 
will show below that the only differentiable functions that do satisfy the additive functional 
identity are the functions f[x] = m · x. In other words, the additive functional identity is 
nearly equivalent to the distributive law; the only unknown (differentiable) function that 
satisfies it is multiplication. Other functional identities such as the 7 given at the start of 
this chapter capture the most important features of the functions that satisfy the respective 
identities. For example, the pair of functions f[x] = 1/x and g[x] = √x do not satisfy the 
addition formula for the sine function, either. 

Example 2.1. The Microscope Equation 

The “microscope equation” defining the differentiability of a function f[x] (see Chapter 
5 of the text), 

(Micro) f[x + δx] = f[x] + f'[x] · δx + ε · δx 

with ε ≈ 0 if δx ≈ 0, is similar to a functional identity in that it involves an unknown 
function f[x] and its related unknown derivative function f'[x]. It “relates” the function 
f[x] to its derivative df/dx = f'[x]. 

You should think of (Micro) as the definition of the derivative of f[x] at a given x, but 
also keep in mind that (Micro) is the definition of the derivative of any function. If we let 
f[x] vary over a number of different functions, we get different derivatives. The equation 
(Micro) can be viewed as an equation in which the function, f[x], is the variable input, and 
the output is the derivative df/dx. 

To make this idea clearer, we rewrite (Micro) by solving for df/dx: 

df/dx = (f[x + δx] − f[x])/δx − ε 

or 

df/dx = lim_{Δx→0} (f[x + Δx] − f[x])/Δx 

If we plug in the “input” function f[x] = x² into this equation, the output is df/dx = 2x. If we 
plug in the “input” function f[x] = Log[x], the output is df/dx = 1/x. The microscope equation 
involves unknown functions, but strictly speaking, it is not a functional identity, because 
of the error term ε (or the limit which can be used to formalize the error). It is only an 
approximate identity. 
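
The idea that the function itself is the input can be imitated on a computer with a function that takes a function and returns a function. This is a hedged sketch only: the name `difference_quotient` and the fixed small real step dx are our assumptions, and a small real dx merely approximates what the infinitesimal δx does exactly up to ε.

```python
import math

def difference_quotient(f, dx=1e-6):
    """Return the function x -> (f(x + dx) - f(x)) / dx, a stand-in for df/dx."""
    return lambda x: (f(x + dx) - f(x)) / dx

square = lambda x: x ** 2

d_square = difference_quotient(square)     # input x^2, output close to 2x
d_log = difference_quotient(math.log)      # input Log[x], output close to 1/x

for x in (0.5, 1.0, 3.0):
    print(x, d_square(x), 2 * x, d_log(x), 1 / x)
```

Changing the input function changes the output function, while the rule `difference_quotient` stays the same.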

Example 2.2. Rules of Differentiation 

The various “differentiation rules,” the Superposition Rule, the Product Rule and the 
Chain Rule (from Chapter 6 of the text) are functional identities relating functions and 
their derivatives. For example, the Product Rule states: 

d(f[x] g[x])/dx = df/dx g[x] + f[x] dg/dx 

We can think of f[x] and g[x] as “variables” which vary by simply choosing different actual 
functions for f[x] and g[x]. Then the Product Rule yields an identity between the choices 
of f[x] and g[x], and their derivatives. For example, choosing f[x] = x² and g[x] = Log[x] 
and plugging into the Product Rule yields 

d(x² Log[x])/dx = 2x Log[x] + x² (1/x) 

Choosing f[x] = x³ and g[x] = Exp[x] and plugging into the Product Rule yields 

d(x³ Exp[x])/dx = 3x² Exp[x] + x³ Exp[x] 

If we choose f[x] = x⁵, but do not make a specific choice for g[x], plugging into the 
Product Rule will yield 

d(x⁵ g[x])/dx = 5x⁴ g[x] + x⁵ dg/dx 
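
Because the Product Rule is an identity in the unknown functions f[x] and g[x], any particular choice can be tested numerically. The sketch below is our own illustration, with assumed helper names `derivative` and `product_rule_gap`; it replaces exact derivatives by symmetric difference quotients, so the two sides agree only up to a small numerical error.

```python
import math

def derivative(f, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def product_rule_gap(f, g, x):
    """d(f*g)/dx minus (df/dx * g + f * dg/dx), all computed numerically at x."""
    lhs = derivative(lambda t: f(t) * g(t), x)
    rhs = derivative(f, x) * g(x) + f(x) * derivative(g, x)
    return lhs - rhs

cube = lambda t: t ** 3
square = lambda t: t ** 2

# Two of the choices made in Example 2.2, at arbitrary sample points.
print(product_rule_gap(cube, math.exp, 1.5))
print(product_rule_gap(square, math.log, 2.0))
```

Both printed gaps are tiny compared with the derivatives themselves, whichever pair of smooth functions is plugged in.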



The goal of this chapter is to extend your thinking to identities in unknown functions. 



Exercise set 2.2 

1. (a) Verify that for any positive number, b, the function f[x] = b^x satisfies the 
functional identity (ExpSum) above for all x and y. 

(b) Is (ExpSum) valid (for all x and y) for the function f[x] = x² or f[x] = x³? 
Justify your answer. 

2. Define f[x] = Log[x] where x is any positive number. Why does this f[x] satisfy the 
functional identities 

(LogProd) f[x · y] = f[x] + f[y] 

and 

(LogPower) f[x^k] = k f[x] 

where x, y, and k are variables. What restrictions should be placed on x and y for the 
above equations to be valid? What is the domain of the logarithm? 

3. Find values of x and y so that the left and right sides of each of the additive formulas 
for 1/x and √x above are not equal. 

4. Show that 1/x and √x also do not satisfy the identity (SinSum), that is, 

1/(x + y) = (1/x) √y + √x (1/y) 

is false for some choices of x and y in the domains of these functions. 

5. (a) Suppose that f[x] is an unknown function which is known to satisfy (LogProd) 
(so f[x] behaves “like” Log[x], but we don’t know if f[x] is Log[x]), and suppose 
that f[0] is a well-defined number (even though we don’t specify exactly what f[0] 
is). Show that this function f[x] must be the zero function, that is show that 
f[x] = 0 for every x. (Hint: Use the fact that 0 · x = 0.) 

(b) Suppose that f[x] is an unknown function which is known to satisfy (LogPower) 
for all x > 0 and all k. Show that f[1] must equal 0, f[1] = 0. (Hint: Fix x = 1, 
and try different values of k.) 

6. (a) Let m and b be fixed numbers and define 

f[x] = m x + b 

Verify that if b = 0, the above function satisfies the functional identity 

(Mult) f[x] = x f[1] 

for all x and that if b ≠ 0, f[x] will not satisfy (Mult) for all x (that is, given a 
nonzero b, there will be at least one x for which (Mult) is not true). 

(b) Prove that any function satisfying (Mult) also automatically satisfies the two 
functional identities 

(Additive) f[x + y] = f[x] + f[y] 

and 

(Multiplicative) f[x y] = x f[y] 

for all x and y. 

(c) Suppose f[x] is a function which satisfies (Mult) (and for now that is the only 
thing you know about f[x]). Prove that f[x] must be of the form f[x] = m · x, 
for some fixed number m (this is almost obvious). 

(d) Prove that a general power function, f[x] = m x^k where k is a positive integer and 
m is a fixed number, will not satisfy (Mult) for all x if k ≠ 1, (that is, if k ≠ 1, 
there will be at least one x for which (Mult) is not true). 

(e) Prove that f[x] = Sin[x] does not satisfy the additive identity. 

(f) Prove that f[x] = 2^x does not satisfy the additive identity. 

7. (a) Let f[x] and g[x] be unknown functions which are known to satisfy f[1] = 2, 
df/dx(1) = 3, g(1) = −3, dg/dx(1) = 4. Let h(x) = f[x] g[x]. Compute dh/dx(1). 

(b) Differentiate the general Product Rule identity to get a formula for 

d²(f g)/dx² 

Use your rule to compute d²h/dx²(1) if d²f/dx²(1) = 5 and d²g/dx²(1) = −2, using other 
values from part 1 of this exercise. 



2.3 The Function Extension Axiom 



This section shows that all real functions have hyperreal extensions that are 
“natural” from the point of view of properties of the original function. 



Roughly speaking, the Function Extension Axiom for hyperreal numbers says that the 
natural extension of any real function obeys the same functional identities and inequalities 
as the original function. In Example 2.7, we use the identity, 

f[x + δx] = f[x] · f[δx] 

with x hyperreal and δx ≈ 0 infinitesimal where f[x] is a real function satisfying f[x + y] = 
f[x] · f[y]. The reason this statement of the Function Extension Axiom is ‘rough’ is because 
we need to know precisely which values of the variables are permitted. Logically, we can 
express the axiom in a way that covers all cases at one time, but this is a little complicated 
so we will precede that statement with some important examples. 

The Function Extension Axiom is stated so that we can apply it to the Log identity in 
the form of the implication 

(x > 0 & y > 0) ⇒ Log[x] and Log[y] are defined and Log[x · y] = Log[x] + Log[y] 

The natural extension of Log[·] is defined for all positive hyperreals and its identities hold for 
hyperreal numbers satisfying x > 0 and y > 0. The other identities hold for all hyperreal x 
and y. To make all such statements implications, we can state the exponential sum equation 
as 

(x = x & y = y) ⇒ e^(x+y) = e^x · e^y 

The differential 

d(Sin[θ]) = Cos[θ] dθ 

is a notational summary of the valid approximation 

Sin[θ + δθ] − Sin[θ] = Cos[θ] δθ + ε · δθ 

where ε ≈ 0 when δθ ≈ 0. The derivation of this approximation based on magnifying a 
circle (given in a CD Section of Chapter 5 of the text) can be made precise by using the 
Function Extension Axiom in the place where it locates (Cos[θ + δθ], Sin[θ + δθ]) on the unit 
circle. This is simply using the extension of the (CircleIden) identity to hyperreal numbers, 
(Cos[θ + δθ])² + (Sin[θ + δθ])² = 1. 

Logical Real Expressions, Formulas and Statements 

Logical real expressions are built up from numbers and variables using functions. Here 
is the precise definition. 

(a) A real number is a real expression. 

(b) A variable standing alone is a real expression. 

(c) If E₁, E₂, ···, Eₙ are real expressions and f[x₁, x₂, ···, xₙ] is a real function of n 
variables, then f[E₁, E₂, ···, Eₙ] is a real expression. 

A logical real formula is one of the following: 

(a) An equation between real expressions, E₁ = E₂. 

(b) An inequality between real expressions, E₁ < E₂, E₁ ≤ E₂, E₁ > E₂, E₁ ≥ E₂, or 
E₁ ≠ E₂. 

(c) A statement of the form “E is defined” or of the form “E is undefined.” 

Let S and T be finite sets of real formulas. A logical real statement is an implication of the 
form, 

S ⇒ T 

or “whenever every formula in S is true, then every formula in T is true.” 

Logical real statements allow us to formalize statements like: “Every point in the square 
below lies in the circle below.” Formalizing the statement does not make it true or false. 
Consider the figure below. 



Figure 2.1: Square and Circle 

The inside of the square shown can be described formally as the set of points satisfying the 
inequalities in the set S = { 0 < x, 0 < y, x < 1.2, y < 1.2 }. The inside of the circle shown can 
be defined as the set of points satisfying the single inequality T = { (x − 1)² + (y − 1)² < 1.6² }. 
This is the circle of radius 1.6 centered at the point (1, 1). The logical real statement S ⇒ T 
means that every point inside the square lies inside the circle. The statement is true for 
every real x and y. First of all, it is clear by visual inspection. Second, points (x, y) that 
make one or more of the formulas in S false produce a false premise, so no matter whether 
or not they lie in the circle, the implication is logically true (if uninteresting). 

The logical real statement T ⇒ S is a valid logical statement, but it is false since it 
says every point inside the circle lies inside the square. Naturally, only true logical real 
statements transfer to the hyperreal numbers. 
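
The truth of a logical real statement can be explored point by point. The Python sketch below is our own illustration (the function names and the sample grid are assumptions); it encodes the formulas of the square and of the circle and evaluates both implications on a grid of real points. A grid check cannot prove the square-in-circle statement, but it does produce an explicit point showing that the converse is false.

```python
def in_square(x, y):
    """All formulas of S: 0 < x, 0 < y, x < 1.2, y < 1.2."""
    return 0 < x and 0 < y and x < 1.2 and y < 1.2

def in_circle(x, y):
    """The single formula of T: (x - 1)^2 + (y - 1)^2 < 1.6^2."""
    return (x - 1) ** 2 + (y - 1) ** 2 < 1.6 ** 2

def implies(p, q):
    """Truth value of the implication p => q; a false premise makes it true."""
    return (not p) or q

# Sample grid over [-1, 3] x [-1, 3] in steps of 0.05 (an arbitrary choice).
grid = [(-1 + 0.05 * i, -1 + 0.05 * j) for i in range(81) for j in range(81)]

s_implies_t = all(implies(in_square(x, y), in_circle(x, y)) for x, y in grid)
t_implies_s = all(implies(in_circle(x, y), in_square(x, y)) for x, y in grid)
counterexample = next((x, y) for x, y in grid
                      if in_circle(x, y) and not in_square(x, y))
print(s_implies_t, t_implies_s, counterexample)
```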

Axiom 2.1. The Function Extension Axiom 

Every logical real statement that holds for all real numbers also holds for all hyper- 
real numbers when the real functions in the statement are replaced by their natural 
extensions. 

The Function Extension Axiom establishes the 5 identities for all hyperreal numbers, 
because x = x and y = y always holds. Here is an example. 

Example 2.3. The Extended Addition Formula for Sine 



S = { x = x, y = y } ⇒ T = { Sin[x] is defined, 
                             Sin[y] is defined, 
                             Cos[x] is defined, 
                             Cos[y] is defined, 
                             Sin[x + y] = Sin[x] Cos[y] + Sin[y] Cos[x] } 

The informal interpretation of the extended identity is that the addition formula for sine 
holds for all hyperreals. 

Example 2.4. The Extended Formulas for Log 

We may take S to be formulas x > 0, y > 0 and p = p and T to be the functional 
identities for the natural log plus the statements “Log[ ] is defined,” etc. The Function 
Extension Axiom establishes that log is defined for positive hyperreals and satisfies the two 
basic log identities for positive hyperreals. 

Example 2.5. Abstract Uses of Function Extension 

There are two general uses of the Function Extension Axiom that underlie most of the 
theoretical problems in calculus. These involve extension of the discrete maximum and 
extension of finite summation. The proof of the Extreme Value Theorem 4.4 below uses a 
hyperfinite maximum, while the proof of the Fundamental Theorem of Integral Calculus 5.1 
uses hyperfinite summation. 

The equivalence of infinitesimal conditions for derivatives or limits and the “epsilon - delta” 
real number conditions is usually proved by using an auxiliary real function as in the proof 
of the limit equivalence Theorem 3.2. 

Example 2.6. The Increment Approximation 

Note: The increment approximation 

f[x + δx] = f[x] + f'[x] · δx + ε · δx 

with ε ≈ 0 for δx ≈ 0 and the simpler statement 

δx ≈ 0 ⇒ f'[x] ≈ (f[x + δx] − f[x])/δx 

are not real logical expressions, because they contain the relation ≈, which is not included 
in the formation rules for logical real statements. (The relation ≈ does not apply to ordinary 
real numbers, except in the trivial case x = y.) 

For example, if θ is any hyperreal and δθ ≈ 0, then 

Sin[θ + δθ] = Sin[θ] Cos[δθ] + Sin[δθ] Cos[θ] 

by the natural extension of the addition formula for sine above. Notice that the natural 
extension does NOT tell us the interesting and important estimate 

Sin[θ + δθ] = Sin[θ] + δθ · Cos[θ] + ε · δθ 

with ε ≈ 0 when δθ ≈ 0. (I.e., Cos[δθ] = 1 + ι · δθ with ι ≈ 0 and Sin[δθ]/δθ ≈ 1 are true, but not real 
logical statements we can derive just from natural extensions.) 



Exercise set 2.3 

1. Write a formal logical real statement S ⇒ T that says, “Every point inside the circle 
of radius 2, centered at (−1, 3) lies outside the square with sides x = 0, y = 0, x = 1, 
y = −1.” Draw a figure and decide whether or not this is a true statement for all real 
values of the variables. 

2. Write a formal logical real statement S ⇒ T that is equivalent to each of the functional 
identities on the first page of the chapter and interpret the extended identities in the 
hyperreals. 



2.4 Additive Functions 



An identity for an unknown function together with the increment approxi- 
mation combine to give a specific kind of function. The two ideas combine 
to give a differential equation. After you have learned about the calculus 
of the natural exponential function in Chapter 8 of the text, you will easily 
understand the exact solution of the problem of this section. 








In the early 1800s, Cauchy asked the question: Must a function satisfying 

(Additive) f[x + y] = f[x] + f[y] 

be of the form f[x] = m · x? This was not solved until 1905, by Hamel. The answer 
is “No.” There are some very strange functions satisfying the additive identity that are not 
simple linear functions. However, these strange functions are not differentiable. We will 
solve a variant of Cauchy’s problem for differentiable functions. 

Example 2.7. A Variation on Cauchy’s Problem 

Suppose an unknown differentiable function f[x] satisfies the (ExpSum) identity for all 
x and y, 

f[x + y] = f[x] · f[y] 

Does the function have to be f[x] = b^x for some positive b? 

Since our unknown function f[x] satisfies the (ExpSum) identity and is differentiable, 
both of the following equations must hold: 

f[x + y] = f[x] · f[y] 

f[x + δx] = f[x] + f'[x] · δx + ε · δx 

We let y = δx in the first identity to compare it with the increment approximation, 

f[x + δx] = f[x] · f[δx] 
f[x + δx] = f[x] + f'[x] · δx + ε · δx 

so 

f[x] · f[δx] = f[x] + f'[x] · δx + ε · δx 
f[x] [f[δx] − 1] = f'[x] · δx + ε · δx 

or, dividing by f[x] · δx where f[x] ≠ 0, 

f'[x]/f[x] = (f[δx] − 1)/δx − ε/f[x] 

with ε ≈ 0 when δx ≈ 0. The identity still holds with hyperreal inputs by the Function 
Extension Axiom. Since the left side of the last equation depends only on x and the quotient 
(f[δx] − 1)/δx on the right hand side does not depend on x at all, we must have 
(f[δx] − 1)/δx ≈ k, a constant, or (f[Δx] − 1)/Δx → k 
as Δx → 0. In other words, a differentiable function that satisfies the (ExpSum) identity 
satisfies the differential equation 

df/dx = k f 

What is the value of our unknown function at zero, f[0]? For any x and y = 0, we have 

f[x] = f[x + 0] = f[x] · f[0] 

so unless f[x] = 0 for all x, we must have f[0] = 1. 

One of the main morals of this course is that if you know: 

(1) where a quantity starts, 
(2) how a quantity changes, 

then you can compute subsequent values of the quantity. In this problem we have found 
(1) f[0] = 1 and (2) df/dx = k f. We can use this information with the computer to calculate 
values of our unknown function f[x]. The unique symbolic solution to 

f[0] = 1 
df/dx = k f 

is 

f[x] = e^(k x) 

The identity (RepeatedExp) allows us to write this as 

f[x] = e^(k x) = (e^k)^x = b^x 

where b = e^k. In other words, we have shown that the only differentiable functions that 
satisfy the (ExpSum) identity are the ones you know from high school, b^x. 
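
Calculating values of the unknown function from its start and its rate of change can be sketched with Euler’s method. The code below is an illustration under our own assumptions (the function name `euler_values`, the step count, and the choice k = Log[2] so that b = 2); it is not the program used in the main text.

```python
import math

def euler_values(k, x_end, steps):
    """Approximate f[x_end] from f[0] = 1 and df/dx = k*f using Euler steps."""
    dx = x_end / steps
    f = 1.0                       # (1) where the quantity starts
    for _ in range(steps):
        f += k * f * dx           # (2) how the quantity changes
    return f

k = math.log(2)                   # chosen so that b = e^k = 2
approx = euler_values(k, 3.0, 100_000)
print(approx, 2 ** 3)

# The constant k can also be estimated from f itself: (f[dx] - 1)/dx for small dx.
dx = 1e-6
print((2 ** dx - 1) / dx, k)
```

The first line of output compares the stepped value with 2³, and the second compares the quotient (f[δx] − 1)/δx for f[x] = 2^x with the constant k.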

Problem 2.1. Smooth Additive Functions ARE Linear 

Suppose an unknown function is additive and differentiable, so it satisfies both 

(Additive) f[x + δx] = f[x] + f[δx] 

(Micro) f[x + δx] = f[x] + f'[x] · δx + ε · δx 

Solve these two equations for f'[x] and argue that since the right side of the equation does 
not depend on x, f'[x] must be constant. (Or f[Δx]/Δx → f'[x₁] and f[Δx]/Δx → f'[x₂], but since 
the left hand side is the same, f'[x₁] = f'[x₂].) 

What is the value of f[0] if f[x] satisfies the additive identity? 

The derivative of an unknown function f[x] is constant and f[0] = 0, what can we say 
about the function? (Hint: Sketch its graph.) 

A project explores this symbolic kind of ‘linearity’ and the microscope equation from 
another angle. 



2.5 The Motion of a Pendulum 



Differential equations are the most common functional identities which arise 
in applications of mathematics to solving “real world” problems. One of the 
very important points in this regard is that you can often obtain significant 
information about a function if you know a differential equation the function 
satisfies, even if you do not know an exact formula for the function. 





For example, suppose you know a function θ[t] satisfies the differential equation 

d²θ/dt² = Sin[θ[t]] 

This equation arises in the study of the motion of a pendulum and θ[t] does not have a 
closed form expression. (There is no formula for θ[t].) Suppose you know θ[0] = π/2. Then 
the differential equation forces 

d²θ/dt²[0] = Sin[θ[0]] = Sin[π/2] = 1 

We can also use the differential equation for θ to get information about the higher 
derivatives of θ[t]. Say we know that dθ/dt[0] = 2. Differentiating both sides of the differential 
equation yields 

d³θ/dt³ = Cos[θ[t]] dθ/dt 

by the Chain Rule. Using the above information, we conclude that 

d³θ/dt³[0] = Cos[θ[0]] dθ/dt[0] = Cos[π/2] · 2 = 0 



Problem 2.2. 

Derive a formula for ^ and prove that ^ [0] = 1 . 



CHAPTER 3 

The Theory of Limits 



The intuitive notion of limit is that a quantity gets close to a “lim- 
iting” value as another quantity approaches a value. This chapter 
defines two important kinds of limits both with real numbers and 
with hyperreal numbers. The chapter also gives many computations 
of limits. 



A basic fact about the sine function is 

lim_{x→0} Sin[x]/x = 1 

Notice that the limiting expression Sin[x]/x is defined for 0 < |x − 0| < 1, but not if x = 0. The 
sine limit above is a difficult and interesting one. The important topic of this chapter is 
“What does the limit expression mean?” rather than the more “practical” question, “How 
do I compute a limit?” 

Here is a simpler limit where we can see what is being approached. 

lim_{x→1} (x² − 1)/(x − 1) 

While this limit expression is also only defined for 0 < |x − 1|, or x ≠ 1, the mystery is 
easily resolved with a little algebra. 

(x² − 1)/(x − 1) = (x − 1)(x + 1)/(x − 1) = x + 1 

So, 

lim_{x→1} (x² − 1)/(x − 1) = lim_{x→1} (x + 1) = 2 

The limit lim_{x→1} (x + 1) = 2 is so obvious as to not need a technical definition. If x is 
nearly 1, then x + 1 is nearly 1 + 1 = 2. So, while this simple example illustrates that the 
original expression does get closer and closer to 2 as x gets closer and closer to 1, it skirts 
the issue of “how close?” 
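
A short table of values makes the approach to 2 visible and also shows that the expression really is undefined at x = 1. The Python sketch below is our own illustration; the function name `q` and the sample offsets are assumptions.

```python
def q(x):
    """The quotient (x^2 - 1)/(x - 1), undefined at x = 1."""
    return (x * x - 1) / (x - 1)

for k in range(1, 6):
    dx = 10.0 ** -k
    # Approach 1 from both sides; the gap from 2 matches the gap of x from 1.
    print(f"x = 1 ± {dx:g}: {q(1 + dx):.6f}  {q(1 - dx):.6f}")

try:
    q(1.0)
except ZeroDivisionError:
    print("q(1) is undefined")
```

The table suggests the answer 2, but like the algebra above it does not yet say how close x must be to 1 for a given accuracy.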



3.1 Plain Limits 



Technically, there are two equivalent ways to define the simple continuous 
variable limit as follows. 



Definition 3.1. Limit 

Let f[x] be a real valued function defined for 0 < |x − a| < Δ with Δ a fixed positive 
real number. We say 

lim_{x→a} f[x] = b 

when either of the equivalent conditions of Theorem 3.2 holds. 



Theorem 3.2. Limit of a Real Variable 

Let f[x] be a real valued function defined for 0 < |x − a| < Δ with Δ a fixed positive 
real number. Let b be a real number. Then the following are equivalent: 

(a) Whenever the hyperreal number x satisfies 0 < |x − a| ≈ 0, the natural 
extension function satisfies 

f[x] ≈ b 

(b) For every accuracy tolerance θ there is a sufficiently small positive real number 
γ such that if the real number x satisfies 0 < |x − a| < γ, then 

|f[x] − b| < θ 



Proof: 

We show that (a) ⇒ (b) by proving that not (b) implies not (a), the contrapositive. 
Assume (b) fails. Then there is a real θ > 0 such that for every real γ > 0 there is a real x 
satisfying 0 < |x − a| < γ and |f[x] − b| ≥ θ. Let X[γ] = x be a real function that chooses 
such an x for a particular γ. Then we have the equivalence 

{γ > 0} ⇔ {X[γ] is defined, 0 < |X[γ] − a| < γ, |f[X[γ]] − b| ≥ θ} 

By the Function Extension Axiom 2.1 this equivalence holds for hyperreal numbers and 
the natural extensions of the real functions X[·] and f[·]. In particular, choose a positive 
infinitesimal γ and apply the equivalence. We have 0 < |X[γ] − a| < γ and |f[X[γ]] − b| ≥ θ 
and θ is a positive real number. Hence, f[X[γ]] is not infinitely close to b, proving not (a) 
and completing the proof that (a) implies (b). 

Conversely, suppose that (b) holds. Then for every positive real θ, there is a positive real 
γ such that 0 < |x − a| < γ implies |f[x] − b| < θ. By the Function Extension Axiom 2.1, 
this implication holds for hyperreal numbers. If ξ ≈ a and ξ ≠ a, then 0 < |ξ − a| < γ for every real γ, 
so |f[ξ] − b| < θ for every real positive θ. In other words, f[ξ] ≈ b, showing that (b) implies 
(a) and completing the proof of the theorem. 

Example 3.1. Condition (a) Helps Prove a Limit 

Suppose we wish to prove completely rigorously that 

lim_{Δx→0} 1/(2(2 + Δx)) = 1/4 

The intuitive limit computation of just setting Δx = 0 is one way to “see” the answer, 

lim_{Δx→0} 1/(2(2 + Δx)) = 1/(2(2 + 0)) = 1/4 

but this certainly does not demonstrate the “epsilon - delta” condition (b). 

Condition (a) is almost as easy to establish as the intuitive limit computation. We wish 
to show that when δx ≈ 0 

1/(2(2 + δx)) ≈ 1/4 

Subtract and do some algebra, 

1/(2(2 + δx)) − 1/4 = 2/(4(2 + δx)) − (2 + δx)/(4(2 + δx)) 

= −δx/(4(2 + δx)) = δx · (−1/(4(2 + δx))) 

We complete the proof using the computation rules of Theorem 1.12. The fraction 
−1/(4(2 + δx)) is finite because 4(2 + δx) ≈ 8 is not infinitesimal. The infinitesimal δx times a finite 
number is infinitesimal. 

1/(2(2 + δx)) − 1/4 ≈ 0 

1/(2(2 + δx)) ≈ 1/4 

This is a complete rigorous proof of the limit. Theorem 3.2 shows that the “epsilon - delta” 
condition (b) holds. 
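
For comparison, condition (b) can be made concrete for this same limit. The sketch below is our own illustration, not the book’s proof: the choice γ = min(1, 4θ) is an assumption justified in the comment, and the code merely samples real values of Δx to see that the tolerance is met.

```python
def g(dx):
    """The expression 1/(2(2 + dx)) whose limit at dx = 0 is 1/4."""
    return 1 / (2 * (2 + dx))

def gamma_for(theta):
    """A hypothetical choice of γ for tolerance θ: if |dx| < 1 then 4(2 + dx) > 4,
    so |g(dx) - 1/4| = |dx| / (4(2 + dx)) < |dx| / 4 < θ whenever |dx| < 4θ."""
    return min(1.0, 4 * theta)

for theta in (0.1, 0.001, 1e-6):
    gamma = gamma_for(theta)
    # Sample dx with 0 < |dx| < γ on both sides of zero.
    samples = [s * gamma * i / 1000 for i in range(1, 1000) for s in (1, -1)]
    worst = max(abs(g(dx) - 0.25) for dx in samples)
    print(theta, gamma, worst, worst < theta)
```

The estimate in the comment is the same algebra as in the example, with the finite factor 1/(4(2 + δx)) bounded explicitly instead of being called finite.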



Exercise set 3.1 

1. Prove rigorously that the limit lim_{Δx→0} 1/(3(3 + Δx)) = 1/9. Use your choice of condition (a) 
or condition (b) from Theorem 3.2. 

2. Prove rigorously that the limit limAx^o ~ 3- your choice of condition 

(a) or condition (b) from Theorem 3.2. 

3. The limit lim_{x→0} Sin[x]/x = 1 means that sine of a small value is nearly equal to the value, 
and near in a strong sense. Suppose the natural extension of a function f[x] satisfies 
f[ξ] ≈ 0 whenever ξ ≈ 0. Does this mean that lim_{x→0} f[x]/x exists? (HINT: What is 
lim_{x→0} √x? What is √ξ/ξ?) 

4. Assume that the derivative of sine is cosine and use the increment approximation 

f[x + δx] − f[x] = f'[x] · δx + ε · δx 

with ε ≈ 0 when δx ≈ 0, to prove the limit lim_{x→0} Sin[x]/x = 1. (It means essentially the 
same thing as the derivative of sine at zero is 1. HINT: Take x = 0 and δx = x in the 
increment approximation.) 



3.2 Function Limits 



Many limits in calculus are limits of functions. For example, the derivative
is a limit and the derivative of $x^3$ is the limit function $3x^2$. This section
defines the function limits used in differentiation theory.



Example 3.2. A Function Limit 

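The derivative of $x^3$ is $3x^2$, a function. When we compute the derivative we use the limit

$$\lim_{\Delta x \to 0} \frac{(x + \Delta x)^3 - x^3}{\Delta x}$$

Again, the limiting expression is undefined at $\Delta x = 0$. Algebra makes the limit intuitively clear,

$$\frac{(x + \Delta x)^3 - x^3}{\Delta x} = \frac{(x^3 + 3x^2 \Delta x + 3x \Delta x^2 + \Delta x^3) - x^3}{\Delta x} = 3x^2 + 3x \Delta x + \Delta x^2$$

The terms with $\Delta x$ tend to zero as $\Delta x$ tends to zero,

$$\lim_{\Delta x \to 0} \frac{(x + \Delta x)^3 - x^3}{\Delta x} = \lim_{\Delta x \to 0} (3x^2 + 3x \Delta x + \Delta x^2) = 3x^2$$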



This is clear without a lot of elaborate estimation, but there is an important point that
might be missed if you don’t remember that you are taking the limit of a function. The
graph of the approximating function approaches the graph of the derivative function. This
approximation is more powerful than approximation at just one particular value of x, and it
makes much of the theory of calculus clearer and more intuitive than a fixed x approach.
Intuitively, it is no harder than the fixed x approach, and infinitesimals give us a way to
establish the “uniform” tolerances with computations almost like the intuitive approach.
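
To see the graph of the approximating function approach the graph of the derivative, one can measure the largest gap between the two over a whole interval. The sketch below does this for the cube function; the interval $[-5, 5]$, the sampling grid, and the helper names are our own choices for illustration. Floating point numbers stand in for real step sizes.

```python
def quotient_error(x, dx):
    """Difference quotient of x**3 minus the derivative 3*x**2."""
    return ((x + dx) ** 3 - x ** 3) / dx - 3 * x ** 2

def sup_error(radius, dx, samples=2001):
    """Largest |quotient_error| over an evenly spaced grid on [-radius, radius]."""
    xs = [-radius + 2 * radius * i / (samples - 1) for i in range(samples)]
    return max(abs(quotient_error(x, dx)) for x in xs)

radius = 5.0  # hypothetical compact interval [-5, 5]
for dx in (0.1, 0.01, 0.001):
    # The algebra gives |3 x dx + dx^2| <= 3 R dx + dx^2 for |x| <= R.
    bound = 3 * radius * dx + dx ** 2
    print(dx, sup_error(radius, dx), bound)
```

One bound serves every x in the interval at once, which is what "uniform" means here.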

Definition 3.3. Locally Uniform Function Limit 

Let $f[x]$ and $F[x, \Delta x]$ be real valued functions defined when x is in a real interval
$(a, b)$ and $0 < |\Delta x| < \Delta$ with $\Delta$ a fixed positive real number. We say

$$\lim_{\Delta x \to 0} F[x, \Delta x] = f[x]$$

uniformly on compact subintervals of $(a, b)$, or “locally uniformly” when one of the
equivalent conditions of Theorem 3.4 holds.







Theorem 3.4. Limit of a Real Function

Let $f[x]$ and $F[x, \Delta x]$ be real valued functions defined when x is in a real interval
$(a, b)$ and $0 < |\Delta x| < \Delta$ with $\Delta$ a fixed positive real number. Then the following are
equivalent:

(a) Whenever the hyperreal numbers $\delta x$ and x satisfy $0 < |\delta x| \approx 0$, x is finite,
and $a < x < b$ with neither $x \approx a$ nor $x \approx b$, the natural extension functions
satisfy

$$F[x, \delta x] \approx f[x]$$

(b) For every accuracy tolerance $\theta$ and every real $\alpha$ and $\beta$ in $(a, b)$, there is
a sufficiently small positive real number $\gamma$ such that if the real number $\Delta x$
satisfies $0 < |\Delta x| < \gamma$ and the real number x satisfies $\alpha \le x \le \beta$, then

$$|F[x, \Delta x] - f[x]| < \theta$$



Proof:

First, we prove not (b) implies not (a). If (b) fails, there are real $\alpha$ and $\beta$, $a < \alpha < \beta < b$,
and real positive $\theta$ such that for every real positive $\gamma$ there are x and $\Delta x$ satisfying

$$0 < |\Delta x| < \gamma, \quad \alpha \le x \le \beta, \quad |F[x, \Delta x] - f[x]| \ge \theta$$

Define real functions $X[\gamma]$ and $DX[\gamma]$ that select such values of x and $\Delta x$,

$$0 < |DX[\gamma]| < \gamma, \quad \alpha \le X[\gamma] \le \beta, \quad |F[X[\gamma], DX[\gamma]] - f[X[\gamma]]| \ge \theta$$

Now apply the Function Extension Axiom 2.1 to the equivalent sets of inequalities,

$$\{\gamma > 0\} \Leftrightarrow \{0 < |DX[\gamma]| < \gamma, \ \alpha \le X[\gamma] \le \beta, \ |F[X[\gamma], DX[\gamma]] - f[X[\gamma]]| \ge \theta\}$$

Choose an infinitesimal $\gamma \approx 0$ and let $x = X[\gamma]$ and $\delta x = DX[\gamma]$. Then

$$0 < |\delta x| < \gamma \approx 0, \quad \alpha \le x \le \beta, \quad |F[x, \delta x] - f[x]| \ge \theta$$

Here x is finite and is not infinitely near a or b, because $\alpha$ and $\beta$ are real with $a < \alpha$ and $\beta < b$.
So $F[x, \delta x] - f[x]$ is not infinitesimal, showing not (a) holds and proving (a) implies (b).

Now we prove that (b) implies (a). Let $\delta x$ be a nonzero infinitesimal and let x satisfy
the conditions of (a). We show that $F[x, \delta x] \approx f[x]$ by showing that for any positive real $\theta$,
$|F[x, \delta x] - f[x]| < \theta$. Fix any one such value of $\theta$.

Since x is finite and not infinitely near a nor b, there are real values $\alpha$ and $\beta$ satisfying
$a < \alpha < x < \beta < b$. Apply condition (b) to these $\alpha$ and $\beta$ together with $\theta$ fixed above. Then
there is a positive real $\gamma$ so that for every real $\xi$ and $\Delta x$ satisfying $0 < |\Delta x| < \gamma$ and
$\alpha \le \xi \le \beta$, we have $|F[\xi, \Delta x] - f[\xi]| < \theta$. In other words, the following implication holds in
the real numbers.

$$\{0 < |\Delta x| < \gamma, \ \alpha \le \xi \le \beta\} \Rightarrow \{|F[\xi, \Delta x] - f[\xi]| < \theta\}$$

Apply the Function Extension Axiom 2.1 to see that the same implication holds in the
hyperreals. Moreover, $\xi = x$ and nonzero $\Delta x = \delta x \approx 0$ satisfy the left hand side of the
implication, so the right side holds. Since $\theta$ was arbitrary, condition (a) is proved.

Example 3.3. Computing Locally Uniform Limits

The following limit is uniform on compact subintervals of $(-\infty, \infty)$.

$$\lim_{\Delta x \to 0} \frac{(x + \Delta x)^3 - x^3}{\Delta x} = \lim_{\Delta x \to 0} (3x^2 + 3x \Delta x + \Delta x^2) = 3x^2$$

A complete rigorous proof based on condition (a) can be obtained with the computation
rules of Theorem 1.12. The difference is infinitesimal

$$(3x^2 + 3x\,\delta x + \delta x^2) - 3x^2 = (3x + \delta x)\,\delta x$$

when $\delta x$ is infinitesimal. First, $3x + \delta x$ is finite because a sum of finite numbers is finite.
Second, infinitesimal times finite is infinitesimal. This is a complete proof and by Theorem 3.4
shows that condition (b) also holds.
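
For readers who want to see the “epsilon - delta” side of this example concretely, the following sketch produces a step bound $\gamma$ from a tolerance $\theta$ for the cube function. The formula for $\gamma$ and the names `gamma_for` and `worst_error` are our own, chosen for illustration; the interval $|x| \le 10$ is a hypothetical compact subinterval.

```python
def gamma_for(theta, radius):
    """A step bound gamma that works for every x with |x| <= radius.

    If 0 < |dx| < gamma <= 1, then
    |3*x*dx + dx**2| <= (3*radius + 1)*|dx| < theta.
    """
    return min(1.0, theta / (3 * radius + 1))

def worst_error(radius, dx, samples=1001):
    """Largest |3*x*dx + dx**2| over an even grid on [-radius, radius]."""
    xs = [-radius + 2 * radius * i / (samples - 1) for i in range(samples)]
    return max(abs(3 * x * dx + dx ** 2) for x in xs)

theta, radius = 1e-3, 10.0
gamma = gamma_for(theta, radius)
for dx in (0.9 * gamma, -0.5 * gamma, 0.1 * gamma):
    # The same gamma serves every x in the interval.
    assert worst_error(radius, dx) < theta
print(gamma)
```

Note that $\gamma$ depends on the interval through `radius` but not on the individual point x.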



[ Exercise set 3.2 ]

1 . Prove rigorously that the locally uniform function limit limAx^o x{x+Ax) ~ 
your choice of condition (a) or condition (b) from Theorem 3.4. 

2. Prove rigorously that the locally uniform function limit limAx^o ^x+Ax+^ ~ 

Use your choice of condition (a), condition (b), or condition (c) from Theorem 3.4. 

3. Prove the following: 

Theorem 3.5. Locally Uniform Derivatives 

Let $f[x]$ and $f'[x]$ be real valued functions defined when x is in a real interval
$(a, b)$. Then the following are equivalent:

(a) Whenever the hyperreal numbers $\delta x$ and x satisfy $\delta x \approx 0$, x is finite,
and $a < x < b$ with neither $x \approx a$ nor $x \approx b$, the natural extension
functions satisfy

$$f[x + \delta x] - f[x] = f'[x] \cdot \delta x + \varepsilon \cdot \delta x$$

for $\varepsilon \approx 0$.

(b) For every accuracy tolerance $\theta$ and every real $\alpha$ and $\beta$ in $(a, b)$, there is
a sufficiently small positive real number $\gamma$ such that if the real number
$\Delta x$ satisfies $0 < |\Delta x| < \gamma$ and the real number x satisfies $\alpha \le x \le \beta$,
then

$$\left| \frac{f[x + \Delta x] - f[x]}{\Delta x} - f'[x] \right| < \theta$$

(c) For every real c in $(a, b)$,

$$\lim_{x \to c,\ \Delta x \to 0} \frac{f[x + \Delta x] - f[x]}{\Delta x} = f'[c]$$

That is, for every real c with $a < c < b$ and every real positive $\theta$,
there is a real positive $\gamma$ such that if the real numbers x and $\Delta x$ satisfy
$0 < |\Delta x| < \gamma$ and $0 < |x - c| < \gamma$, then $\left| \frac{f[x + \Delta x] - f[x]}{\Delta x} - f'[c] \right| < \theta$.

3.3 Computation of Limits



Limits can be computed in a way that rigorously establishes them as results
by using the rules of Theorem 1.12.



Suppose we want to compute a limit like

$$\lim_{d \to 0} \frac{(x + d)^2 - x^2}{d}$$

First we observe that it does no good to substitute $d = 0$ into the expression, because we
get 0/0. We do some algebra,

$$\frac{(x + d)^2 - x^2}{d} = \frac{x^2 + 2xd + d^2 - x^2}{d} = \frac{2xd + d^2}{d} = 2x + d$$

Now,

$$\lim_{d \to 0} \frac{(x + d)^2 - x^2}{d} = \lim_{d \to 0} (2x + d) = 2x$$

because making d smaller and smaller makes the expression closer and closer to $2x$. The
rules of small, medium and large numbers given in Theorem 1.12 just formalize the kinds of
operations that work in such calculations. Theorem 3.2 and Theorem 3.4 show that these
rules establish the limits as rigorously proven facts.

Exercise 3.3.1 below contains a long list of drill questions that may be viewed as limit
computations. For example,

$$\lim_{d \to 0} \frac{1}{d} \left( \frac{1}{b + d} - \frac{1}{b} \right)$$

is just asking what happens as d becomes small. Another way to ask the question is

$$\frac{1}{\delta} \left( \frac{1}{b + \delta} - \frac{1}{b} \right) \quad \text{when } \delta \approx 0$$

The latter approach is well suited to direct computations and can be solved with the rules
of Theorem 1.12 that formalize our intuitive notions of small, medium and large numbers.

Following are some sample calculations with parameters a, b, c, $\delta$, $\varepsilon$, H, K from
Exercise 3.3.1.

Example 3.4. Infinitesimal, Finite and Infinite Computations 

We are told that $a \approx 2$ and $b \approx 5$, so we may write $a = 2 + \iota$ and $b = 5 + \theta$ with $\iota \approx 0$
and $\theta \approx 0$. Now we compute $b - a = 5 + \theta - 2 - \iota = 5 - 2 + (\theta - \iota) = 3 + (\theta - \iota)$ by rules
of algebra. The negative of an infinitesimal is infinitesimal and the sum of a positive and
negative infinitesimal is infinitesimal, hence $\theta - \iota \approx 0$. This makes

$$b - a \approx 3$$

Another correct way to do this computation is the following

$$a \approx 2$$

$$b \approx 5$$

$$b - a \approx 5 - 2 = 3$$

However, this is NOT use of ordinary rules of algebra, because ordinary rules of algebra do
not refer to the infinitely close relation $\approx$. This form of the computation incorporates the
fact that negatives of infinitesimals are infinitesimal and the fact that sums of infinitesimals
are infinitesimal.

Example 3.5. Small, Medium and Large as Limits 

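The approximate computations can be re-phrased in terms of limits. We can replace the
occurrences of $\delta \approx 0$ and $\varepsilon \approx 0$ by variables approaching zero and so forth. Let’s just do
this by change of alphabet, $\lim_{d \to 0}$ for $\delta \approx 0$ and $\lim_{\alpha \to 2}$ for $a \approx 2$.

The computation $b - a \approx 3$ can be viewed as the limit

$$\lim_{\alpha \to 2,\ \beta \to 5} (\beta - \alpha) = 3$$

The computation $(2 - \delta)/a \approx 1$ becomes

$$\lim_{d \to 0,\ \alpha \to 2} \frac{2 - d}{\alpha} = 1$$

The computation $\frac{\sqrt{a + \delta} - \sqrt{a}}{\delta} \approx \frac{1}{2\sqrt{2}}$ becomes

$$\lim_{d \to 0,\ \alpha \to 2} \frac{\sqrt{\alpha + d} - \sqrt{\alpha}}{d} = \frac{1}{2\sqrt{2}}$$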



Example 3.6. Hyperreal Roots 

When $a \approx 2$ and $c \approx -7$, the Function Extension Axiom guarantees in particular that
$\sqrt{a}$ is defined and that $\sqrt{c}$ is undefined, since $a > 0$ is positive and $c < 0$ is negative. The
computation rules may be used to show that $\sqrt{a} \approx \sqrt{2}$, that is, we do not need any more
rules to show this. First, $\sqrt{a}$ is finite, because $a < 3$ implies

$$\sqrt{a} < \sqrt{3} < 2$$

by the Function Extension Axiom. Next,

$$\sqrt{a} - \sqrt{2} = \frac{(\sqrt{a} - \sqrt{2})(\sqrt{a} + \sqrt{2})}{\sqrt{a} + \sqrt{2}} = \frac{a - 2}{\sqrt{a} + \sqrt{2}} = (a - 2) \cdot \frac{1}{\sqrt{a} + \sqrt{2}}$$

an infinitesimal times a finite number, by approximation rule (4). (The second factor is finite
because $\sqrt{a} + \sqrt{2} > \sqrt{2}$ is not infinitesimal.) Finally, approximation
rule (3) shows

$$\sqrt{a} - \sqrt{2} \approx 0 \quad \text{or} \quad \sqrt{a} \approx \sqrt{2}$$

Example 3.7. A Limit of Square Root

The “epsilon - delta” proof (condition (b) of Theorem 3.2) of

$$\lim_{x \to 0} \sqrt{|x|} = 0$$

is somewhat difficult. We establish the equivalent condition (a) as follows.

Let $0 < \xi \approx 0$ be a positive infinitesimal. Since $x > 0$ implies $\sqrt{x}$ is defined and positive,
the Function Extension Axiom 2.1 guarantees that $\sqrt{\xi}$ is defined and positive.

Suppose $\sqrt{\xi}$ is not infinitesimal. Then there is a positive real number $0 < a$ with $a < \sqrt{\xi}$.
Squaring and using the Function Extension Axiom on the property $0 < b < c$ implies $0 < \sqrt{b} < \sqrt{c}$,
we see that $0 < a^2 < \xi$, contradicting the assumption that $\xi \approx 0$ is infinitesimal.

Example 3.8. Infinite Limits 

We know that $c + 7 \ne 0$ because we are given that $c \ne -7$ and $c \approx -7$, or $c = -7 + \iota$ with
$\iota \approx 0$, but $\iota \ne 0$. This means that $c + 7 = \iota \ne 0$ and so

$$c + 7 \approx 0$$

This, together with what we know about reciprocals of infinitesimals tells us that

$$\frac{1}{c + 7} \ \text{is infinite}$$

We do not know if it is positive or negative; we simply weren’t told whether $c < -7$ or $c > -7$,
but only that $c \approx -7$.

In this example, the limit formulation has the result

$$\lim_{\gamma \to -7} \frac{1}{\gamma + 7} \ \text{does not exist} \quad \text{or} \quad \lim_{\gamma \to -7} \left| \frac{1}{\gamma + 7} \right| = +\infty$$

The precise meaning of these symbols is as follows. 

Definition 3.6. Infinite Limits 

Let $f[x]$ be a real function defined for a neighborhood of a real number a, $0 < |x - a| < \Delta$. We say

$$\lim_{x \to a} f[x] = \infty$$

provided that for every large positive real number B, there is a sufficiently small
real tolerance, $\tau$, such that if $0 < |x - a| < \tau$ then $f[x] > B$.

The symbol $\infty$ means “gets bigger and bigger.” This is equivalent to the following
hyperreal condition.

Theorem 3.7. A Hyperreal Condition for Infinite Limits 

Let $f[x]$ be a real function defined for a neighborhood of a real number a, $0 < |x - a| < \Delta$.
The definition of $\lim_{x \to a} f[x] = \infty$ is equivalent to the following.

For every hyperreal x infinitely close to a, but distinct from it, the natural
extension satisfies, $f[x]$ is a positive infinite hyperreal.

Proof: Left as an exercise below. 

Example 3.9. $\infty$ is NOT Hyperreal

The symbol $\infty$ cannot stand for a number because it does not obey the usual laws of
algebra. Viewed as a numerical equation $\infty \cdot \infty = \infty$ says that we must have $\infty = 1$ or
$\infty = 0$, by ordinary rules of algebra. Since we want to retain the rules of algebra, $\infty$ in the
sense of ‘very big’ can not be a hyperreal number.



Example 3.10. An Infinite Limit with Roots

The limit

$$\lim_{x \to 0} \frac{1}{\sqrt{|x|}} = +\infty$$

Proof:

Let $0 < \xi \approx 0$. We know from Example 3.7 that $\sqrt{\xi} \approx 0$. We know from algebra
and the Function Extension Axiom that

$$\frac{1}{\sqrt{|\pm\xi|}} = \frac{1}{\sqrt{\xi}} > 0$$

Using Theorem 1.12, we see that this expression is infinitely large.
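
The real-number form of an infinite limit (Definition 3.6) can be illustrated numerically. The sketch below assumes the limit in question is that of $1/\sqrt{|x|}$ as x tends to 0, and it uses the tolerance $\tau = 1/B^2$, which is our own choice: if $0 < |x| < 1/B^2$ then $\sqrt{|x|} < 1/B$, so the reciprocal exceeds B. The function names are hypothetical.

```python
import math

def tolerance_for(bound):
    """Return tau such that 1/sqrt(|x|) > bound whenever 0 < |x| < tau."""
    return 1.0 / bound ** 2

def f(x):
    """The function 1/sqrt(|x|), defined for x != 0."""
    return 1.0 / math.sqrt(abs(x))

for bound in (10.0, 1e3, 1e6):
    tau = tolerance_for(bound)
    for x in (0.5 * tau, -0.5 * tau, 1e-3 * tau):
        # Points on both sides of zero, all within the tolerance.
        assert 0 < abs(x) < tau and f(x) > bound
```

Larger bounds B force smaller tolerances, which is the sense in which the function “gets bigger and bigger.”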

Example 3.11. Indeterminate Forms 

Even though arguments similar to the ones we have just done show that $a + b + c \approx 0$ we
can not conclude that $\frac{1}{a+b+c}$ is defined. For example, we might have $a = 2 - \delta$, $b = 5 + 3\delta$
and $c = -7 - 2\delta$. Then $a \approx 2$, $b \approx 5$ and $c \approx -7$, but $a + b + c = 0$. (Notice that it is true
that $a + b + c \approx 0$.) In this case $\frac{1}{a+b+c}$ is not defined. Other choices of the perturbations
might make $a + b + c \ne 0$, so $\frac{1}{a+b+c}$ is defined (and positive or negative infinite) in some
cases, but not in others. This means that the value of

$$\frac{1}{a + b + c}$$

can not be determined knowing only that the sum is infinitesimal.

In Webster’s unabridged dictionary, the term “indeterminate” has the following symbolic
characters along with the verbal definition

$$\frac{0}{0}, \quad \frac{\infty}{\infty}, \quad \infty \cdot 0, \quad 1^{\infty}, \quad 0^0, \quad \infty^0, \quad \infty - \infty$$

In the first place, Webster’s definition pre-dates the discovery of hyperreal numbers. The
symbol $\infty$ does NOT represent a real or hyperreal number, because things like $\infty \cdot \infty = \infty$
only denote ‘limit of big times limit of big is big.’ The limit forms above do not have a
definite outcome.

Each of the symbolic short-cuts above has a hyperreal number calculation with indeterminate
outcomes in the sense that they may be infinitesimal, finite or infinite depending on
the particular infinitesimal, finite or infinite numbers in the computation. In this sense, the
older infinities are compatible with infinitely large hyperreal numbers.

Example 3.12. The Indeterminate Form $\infty - \infty$

Consider $\infty - \infty$. The numbers H and $L = H + \delta$ are both infinite numbers, but
$H - L = -\delta$ is infinitesimal. The numbers K and $M = K + b$ are both infinite and
$K - M \approx -5$. The numbers H and $N = H^2$ are both infinite and $H - N = (1 - H) \cdot H$ is
a negative infinite number.

We may view the symbolic expression “$\infty - \infty$ is indeterminate” as a short-hand for the
fact that the difference between two infinite hyperreal numbers can be positive or negative
infinite, positive or negative and finite or even positive or negative infinitesimal. Of course,
it can also be zero.

Example 3.13. The Indeterminate Form $0 \cdot \infty$

The short-hand symbolic expression “$0 \cdot \infty$ is indeterminate” corresponds to the following
kinds of choices. Suppose that $H = 1/\delta$. Then $\delta \cdot H = 1$. An infinitesimal times an infinite
number could be finite. Suppose $K = H^2$ so K is infinite. Now $\delta \cdot K = H$ is infinite.
An infinitesimal times an infinite number could be infinite. Finally, suppose $\varepsilon = \delta^3$. Then
$\varepsilon \cdot K = \delta^3/\delta^2 = \delta$ is infinitesimal.
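
The three outcomes for a small number times a large number can be checked with exact rational arithmetic. In the sketch below a small rational `delta` stands in for a positive infinitesimal, `H = 1/delta` stands in for an infinite number, and the exponent 3 in the last case is our own choice for illustration. The function name `products` is hypothetical.

```python
from fractions import Fraction

def products(delta):
    """Exact products of small and large numbers built from one small delta."""
    H = 1 / delta   # large when delta is small
    K = H ** 2      # larger still
    return {
        "delta*H": delta * H,         # always 1: the product can be finite
        "delta*K": delta * K,         # equals H: the product can be large
        "delta^3*K": delta ** 3 * K,  # equals delta: the product can be small
    }

for k in (3, 6, 9):
    delta = Fraction(1, 10 ** k)
    p = products(delta)
    assert p["delta*H"] == 1
    assert p["delta*K"] == 1 / delta
    assert p["delta^3*K"] == delta
```

Knowing only "small times large" does not determine which of the three cases occurs.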

The following is just for practice at using the computation rules for infinitesimal, finite, 
and infinite numbers from Theorem 1.12 to compute limits rigorously. These computations 
prove that the “epsilon - delta” conditions hold. 




1. Drill with Rules of Infinitesimal, Finite and Infinite Numbers 
In the following formulas, 



$0 < \varepsilon \approx 0$ and $0 < \delta \approx 0$, H and K are infinite and positive.

$a \approx 2$, $b \approx 5$, $c \approx -7$, but $a \ne 2$, $b \ne 5$, $c \ne -7$

Say whether each expression is infinitesimal, finite (and which real number it is near), 
infinite, or indeterminate (that is, could be in different categories depending on the 
particular values of the parameters above.) 



1 


y = 


e X 6 


2 


y = 


e — 5 


3 


y = 


efb 


4 


V = 


e/S 


5 


y = 


a-\-7e 

b-iS 


6 


y = 


hje 


7 


y = 


a + b — c 


8 


y = 


a + 5 


9 


y = 


c — e 


10 


y = 


a — 2 


11 


y = 


1 

a— 2 


12 


y = 


1 

a—b 


13 


y = 


C 

a—b 


u 


y = 


2-5 

a 


15 


y = 


5(5^-3(5^+25 

S 


16 


y = 


1 

H 


17 


y = 


2-5 

a-K 


18 


y = 


5(5^-3(52+25 

45+52 


19 


y = 


H^+3H 

H 


20 


y = 




21 


y = 


352 

(5+8(5^ 


6)6) 




H-K 


6)0 




H-K 






H-K 




y = 


H 


/CO 


y = 


H K 




y = 


H-\-K 


25 


y = 


Vh 


26 


y = 


V~5 


27 


y = 


1 + 


28 


y = 


Vh 

H-\-a 


29 


y = 


\/a + 5 — \fa 


30 


y = 


1 1 

b-\-5 b 
2. Re-write the problems of the previous exercise as limits.

3. Prove Theorem 3.7.




CHAPTER 4

Continuous Functions



A function f[x] is continuous if a small change in its input only 
produces a small change in its output. This chapter gives some 
fundamental consequences of this property. 



Definition 4.1. Continuous Function 

Suppose a real function $f[x]$ is defined in a neighborhood of a, $|x - a| < \Delta$. We
say $f[x]$ is continuous at a if whenever $x \approx a$ in the hyperreal numbers, the natural
extension satisfies $f[x] \approx f[a]$.

Notice that continuity assumes that $f[a]$ is defined. The function

$$f[x] = \frac{\sin[x]}{x}$$

is technically not continuous at $x = 0$, but since $\lim_{x \to 0} f[x] = 1$ we could extend the
definition to include $f[0] = 1$ and then the function would be continuous.

Theorem 4.2. Continuity as Limit 

Suppose a real function $f[x]$ is defined in a neighborhood of a, $|x - a| < \Delta$. Then
$f[x]$ is continuous at a if and only if $\lim_{x \to a} f[x] = f[a]$.

Proof:

Apply Theorem 3.2.

We show in Section ?? that differentiable functions are continuous, so rules of calculus 
give us an easy way to verify that a function is continuous. 

4.1 Uniform Continuity 

A function is uniformly continuous if given an “epsilon,” the same “delta” 
works “uniformly” for all x. 



The simplest intervals are the ones of finite length that include their endpoints, $[a, b]$, for
numbers a and b. These intervals are sometimes described as ‘closed and bounded,’ because
they have the endpoints and have bounded length. A shorter name is ‘compact’ intervals.

Every hyperreal number x satisfying $a \le x \le b$ is near a real number c, $x \approx c$, with $a \le c \le b$.
First, the hyperreal x has a standard part since it is finite. Second, c must lie in the
interval because real numbers r outside the interval are a noninfinitesimal distance from the
endpoints. We cannot have $x \approx r$ and r a noninfinitesimal distance from the interval.

The fact that every hyperreal point of a set is near a standard point of the set is equivalent 
to the “finite covering property” of general topologically compact spaces. The hyperreal 
condition is easy to apply directly. The following theorem illustrates this (although we do 
not need the theorem later in the course.) 

Theorem 4.3. Continuous on a Compact Interval 

Suppose that a real function $f[x]$ is defined and continuous on the compact real
interval $[a, b] = \{x \mid a \le x \le b\}$. Then for every real positive $\theta$ there is a real
positive $\gamma$ such that if $|x_1 - x_2| < \gamma$ in $[a, b]$, then $|f[x_1] - f[x_2]| < \theta$.

Proof:

Since $f[x]$ is continuous at every point of $[a, b]$, if $\xi \approx c$ for $a \le c \le b$, then $f[\xi] \approx f[c]$.

Further, since the interval $[a, b]$ includes its real endpoints, if a hyperreal number x
satisfies $a \le x \le b$ then its standard part c from Theorem 1.11 lies in the interval and $x \approx c$.

Let $x_1$ and $x_2$ be any two points in $[a, b]$ with $x_1 \approx x_2$. Both of these numbers have the
same standard part (since the real standard parts have to be infinitely close and real, hence
equal.) We have

$$f[x_1] \approx f[c] \approx f[x_2]$$

so for any numbers $x_1 \approx x_2$ in $[a, b]$, $f[x_1] \approx f[x_2]$.

Suppose the conclusion of the theorem is false. Then there is a real $\theta > 0$ such that for
every $\gamma > 0$ there exist $x_1$ and $x_2$ in $[a, b]$ with $|x_1 - x_2| < \gamma$ and $|f[x_1] - f[x_2]| \ge \theta$. Define
real functions $X_1[\gamma]$ and $X_2[\gamma]$ that select such values and give us the real logical statement

$$\{\gamma > 0\} \Rightarrow \{a \le X_1[\gamma] \le b, \ a \le X_2[\gamma] \le b, \ |X_1[\gamma] - X_2[\gamma]| < \gamma, \ |f[X_1[\gamma]] - f[X_2[\gamma]]| \ge \theta\}$$

Now apply the Function Extension Axiom 2.1 to this implication and select a positive
infinitesimal $\gamma \approx 0$. Let $x_1 = X_1[\gamma]$, $x_2 = X_2[\gamma]$ and notice that they are in the interval,
$x_1 \approx x_2$, but $f[x_1]$ is not infinitely close to $f[x_2]$. This contradiction shows that the theorem
is true.
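
The conclusion of the theorem, one $\gamma$ serving the whole interval, can be seen in a concrete case. The sketch below uses the square root function on $[0, 1]$, a hypothetical example of our own choosing, where $|\sqrt{u} - \sqrt{v}| \le \sqrt{|u - v|}$ shows that $\gamma = \theta^2$ works. The grid search is only a spot check over sample points, not a proof.

```python
import math

def gamma_for(theta):
    """For sqrt on [0, 1]: |sqrt(u) - sqrt(v)| <= sqrt(|u - v|), so theta**2 works."""
    return theta ** 2

def worst_gap(gamma, samples=250):
    """Largest |sqrt(u) - sqrt(v)| over grid pairs in [0, 1] with |u - v| < gamma."""
    xs = [i / samples for i in range(samples + 1)]
    worst = 0.0
    for u in xs:
        for v in xs:
            if abs(u - v) < gamma:
                worst = max(worst, abs(math.sqrt(u) - math.sqrt(v)))
    return worst

for theta in (0.3, 0.1):
    gap = worst_gap(gamma_for(theta))
    assert gap < theta   # the same gamma works at every point of [0, 1]
    print(theta, gamma_for(theta), gap)
```

The worst pairs sit near 0, where the graph is steepest, yet the single tolerance still suffices.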

4.2 The Extreme Value Theorem 



Continuous functions attain their max and min on compact intervals. 



Theorem 4.4. The Extreme Value Theorem 

If $f[x]$ is a continuous real function on the real compact interval $[a, b]$, then f
attains its maximum and minimum, that is, there are real numbers $x_m$ and $x_M$
such that $a \le x_m \le b$, $a \le x_M \le b$, and for all x with $a \le x \le b$

$$f[x_m] \le f[x] \le f[x_M]$$



Intuitive Proof:

We will show how to locate the maximum; you can find the minimum. Partition the
interval into steps of size $\Delta x$,

$$a < a + \Delta x < a + 2\Delta x < \cdots \le b$$

and define a real function

$$M[\Delta x] = \text{the } x \text{ of the form } x_1 = a + k\,\Delta x$$

so that

$$f[M[\Delta x]] = f[x_1] = \max[f[x] : x = a + h\,\Delta x, \ h = 0, 1, \cdots, n]$$

This function is the discrete maximum from among a finite number of possibilities, so that
$M[\Delta x]$ has two properties: (1) $M[\Delta x]$ is one of the partition points and (2) all other
partition points $x = a + h\,\Delta x$ satisfy $f[x] \le f[M[\Delta x]]$.

Next, we partition the interval into infinitesimal steps,

$$a < a + \delta x < a + 2\delta x < \cdots \le b$$

and consider the natural extension of the discrete maximizing function $M[\delta x]$. By the
Function Extension Axiom 2.1 we know that (1) $x_1 = M[\delta x]$ is one of the points in the
infinitesimal partition and (2) $f[x] \le f[x_1]$ for all other partition points x.

Since the hyperreal interval $[a, b]$ only contains finite numbers, there is a real number
$x_M \approx x_1$ (standard part) and every other real number $x_2$ in $[a, b]$ is within $\delta x$ of some
partition point, $x_2 \approx x$.

Continuity of f means that $f[x] \approx f[x_2]$ and $f[x_M] \approx f[x_1]$. The numbers $x_2$ and $x_M$
are real, so $f[x_2]$ and $f[x_M]$ are also real and we have

$$f[x_2] \approx f[x] \le f[x_1] \approx f[x_M]$$

Thus, for any real $x_2$, $f[x_2] \le f[x_M]$, which says f attains its maximum at $x_M$. This
completes the proof.
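
The discrete maximizing function in this proof is easy to compute for real step sizes. The sketch below is an illustration under our own assumptions: the name `discrete_maximizer`, the sample function $x(1 - x)$ on $[0, 1]$, and floating point steps in place of infinitesimal ones.

```python
def discrete_maximizer(f, a, b, dx):
    """Return a partition point a + k*dx in [a, b] where f is largest."""
    n = int((b - a) / dx)  # number of whole steps that fit in the interval
    points = [min(a + k * dx, b) for k in range(n + 1)]
    return max(points, key=f)

def f(x):
    """Hypothetical example; its true maximum is 1/4 at x = 1/2."""
    return x * (1.0 - x)

for dx in (0.3, 0.07, 0.001):
    x1 = discrete_maximizer(f, 0.0, 1.0, dx)
    print(dx, x1, f(x1))
```

As the step shrinks, the discrete maximizer moves toward the true maximum point, which is the real-number counterpart of taking the standard part of $M[\delta x]$.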

Partition Details of the Proof:

Let a and b be real numbers and suppose a real function $f[x]$ is defined for $a \le x \le b$.
Let $\Delta x$ be a positive number smaller than $b - a$. There are finitely many numbers of the
form $a + k\,\Delta x$ between a and b; $a = a + 0\,\Delta x, a + \Delta x, a + 2\Delta x, \cdots, a + n\,\Delta x \le b$. The
corresponding function values, $f[a], f[a + \Delta x], f[a + 2\Delta x], \cdots, f[a + n\,\Delta x]$ have a largest
amongst them, say $f[a + m\,\Delta x] \ge f[a + k\,\Delta x]$ for all other k. We can express this with a
function $M[\Delta x] = a + m\,\Delta x$ “is the place amongst the points $a < a + \Delta x < a + 2\Delta x < \cdots < a + n\,\Delta x \le b$
such that $f[M[\Delta x]] \ge f[a + k\,\Delta x]$.” (There could be more than one,
but $M[\Delta x]$ chooses one of them.)

A better way to formulate this logically is to say, ‘if x is of the form $a + k\,\Delta x$, then
$f[x] \le f[M[\Delta x]]$.’ This can also be formulated with functions. Let $I[x]$ be the ‘indicator
function of integers,’ that is $I[x] = 1$ if $x = 0, \pm 1, \pm 2, \pm 3, \cdots$ and $I[x] = 0$ otherwise. Then
the maximizing property of $M[\Delta x] = a + m\,\Delta x$ can be summarized by

$$a \le x \le b \ \text{ and } \ I\left[\frac{x - a}{\Delta x}\right] = 1 \ \Rightarrow \ f[x] \le f[M[\Delta x]]$$

The rigorous formulation of the Function Extension Axiom covers this case. We take S to
be the set of formulas, $\Delta x > 0$, $a \le x$, $x \le b$ and $I[[x - a]/\Delta x] = 1$ and take T to be the
inequality $f[x] \le f[M[\Delta x]]$. The Function Extension Axiom shows that $M[\delta x]$ is the place
where $f[x]$ is largest among points of the form $a + k\,\delta x$, even when $\delta x \approx 0$ is infinitesimal,
but it says this as follows:

$$a \le x \le b \ \text{ and } \ I\left[\frac{x - a}{\delta x}\right] = 1 \ \Rightarrow \ f[x] \le f[M[\delta x]]$$

We interpret this as meaning that, ‘among the hyperreal numbers of the form $a + k\,\delta x$, $f[x]$
is largest when $x = M[\delta x]$,’ even when $\delta x$ is a positive infinitesimal.



4.3 Bolzano’s Intermediate Value Theorem



The graphs of continuous functions have no “jumps.”



Theorem 4.5. Bolzano’s Intermediate Value Theorem

If $y = f[x]$ is continuous on the interval $a \le x \le b$, then $f[x]$ attains every
value intermediate between the values $f[a]$ and $f[b]$. In particular, if $f[a] < 0$ and
$f[b] > 0$, then there is an $x_0$, $a < x_0 < b$, such that $f[x_0] = 0$.

Proof:

The following idea makes a technically simple general proof. Suppose we want to hit a
real value $\gamma$ between the values of $f[a] = \alpha$ and $f[b] = \beta$. Divide the interval $[a, b]$ up into
small steps each $\Delta x$ long, $a, a + \Delta x, a + 2\Delta x, a + 3\Delta x, \cdots, b$. Suppose $\alpha < \gamma < \beta$. The
function $f[x]$ starts at $x = a$ with $f[a] = \alpha < \gamma$. At the next step, it may still be below $\gamma$,
$f[a + \Delta x] < \gamma$, but there is a first step, $a + k\,\Delta x$, where $f[a + k\,\Delta x] \ge \gamma$ and $f[x] < \gamma$ for all
x of the form $x = a + h\,\Delta x$ with $h < k$.

Figure 4.1: $[a, b]$ in steps of size $\Delta x$

We need a general function for this. Let the function

$$M[\Delta x] = \mathrm{Min}[x : f[x] \ge \gamma, \ a < x \le b, \ x = a + \Delta x, a + 2\Delta x, a + 3\Delta x, \cdots] = a + k\,\Delta x$$

give this minimal x as a function of the step size $\Delta x$.

The natural extension of this Min function has the property that even when we compute
at an infinitesimal step size, $\xi = M[\delta x]$ satisfies $f[\xi] \ge \gamma$, and $f[x] < \gamma$ for $x = a + h\,\delta x < \xi$,
in particular $f[\xi - \delta x] < \gamma$. Infinitesimals let continuity enter the picture.

Continuity of $f[x]$ means that if $c \approx x$, then $f[c] \approx f[x]$. We take c to be the standard
real number such that $c \approx \xi = M[\delta x]$. We know $f[c] \approx f[\xi] \ge \gamma$ and $f[c] \approx f[\xi - \delta x] < \gamma$.
Since $f[c]$ must be a real value for a real function at a real input, and since we have just
shown that $f[c] \approx \gamma$, it follows that $f[c] = \gamma$, because ordinary reals can only be infinitely
close if they are equal.
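
The stepping search in this proof can be run with real step sizes. The sketch below is a minimal illustration under our own assumptions: the name `first_crossing`, the sample function $x^2$ on $[0, 2]$ with target value 2, and floating point steps in place of infinitesimal ones.

```python
import math

def first_crossing(f, a, b, gamma, dx):
    """Return the first partition point a + k*dx (k >= 1) with f >= gamma, or b."""
    k = 1
    while a + k * dx < b:
        x = a + k * dx
        if f(x) >= gamma:
            return x
        k += 1
    return b  # the right endpoint closes the partition

def f(x):
    """Hypothetical example: f[0] = 0 < 2 < 4 = f[2]."""
    return x * x

for dx in (0.1, 0.01, 0.001):
    xi = first_crossing(f, 0.0, 2.0, 2.0, dx)
    print(dx, xi, abs(xi - math.sqrt(2.0)))
```

The crossing point is trapped within one step of the true solution $\sqrt{2}$, so shrinking the step squeezes it onto the solution, just as the standard part does in the proof.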








1 Variable Differentiation

CHAPTER 5

The Theory of Derivatives



This chapter shows how the traditional “epsilon - delta” theory, rigorous
infinitesimal analysis, and the intuitive approximations of the
main text are related.



The chapter shows that

$$\lim_{\Delta x \to 0} \frac{f[x + \Delta x] - f[x]}{\Delta x} = f'[x] \ \text{uniformly} \iff f[x + \delta x] = f[x] + f'[x]\,\delta x + \varepsilon \cdot \delta x \ \text{with } \varepsilon \approx 0 \text{ for } \delta x \approx 0$$

with all the provisos needed to make both of these exactly formally correct. Then this
approximation is used to prove some of the basic facts about derivatives.




5.1 The Fundamental Theorem: Part 1



We begin with an overview that illustrates the two main approximations of 
calculus, how they fit together, and how the fine details are added if you 
wish to make formal arguments based on an intuitive approximation. 



We re-write the traditional limit for a derivative as an approximation for the differential.
Then we plug this differential approximation into an approximation of an integral to see
why the Fundamental Theorem of Integral Calculus is true. The two main approximations
interact to let us compute integrals by finding antiderivatives. For now, we simply treat
the symbol “wiggly equals,” $\approx$, as an intuitive “approximately equals.” Subsections below
justify the use both in terms of hyperreal infinitesimals and in terms of uniform limits. The
point of the section is that the intuitive arguments are correct because we can fill in all the
details if we wish.

The Intuitive Derivative Approximation



The traditional approach to derivatives is the approximation of secant lines approaching
the slope of the tangent. Symbolically, this is

$$\lim_{\Delta x \to 0} \frac{f[x + \Delta x] - f[x]}{\Delta x} = f'[x]$$

The intuitive meaning of this formula is that $(f[x + \Delta x] - f[x])/\Delta x$ is approximately equal
to $f'[x]$ when the difference, $\Delta x$, is small, $\Delta x = \delta x \approx 0$. We write this with an explicit
error,

$$\frac{f[x + \delta x] - f[x]}{\delta x} = f'[x] + \varepsilon$$

where the error given by the Greek letter epsilon, $\varepsilon$, is small, provided that $\delta x$ is small. We
use lower case or small delta to indicate that the approximation is valid for a small difference
in the value of x. [$\delta$ is lower case $\Delta$. Both stand for “difference” because the difference in
x-input is $\delta x = (x + \delta x) - x$.]

This approximation can be rewritten using algebra and expressed in the form

$$f[x + \delta x] - f[x] = f'[x]\,\delta x + \varepsilon \cdot \delta x \quad \text{with } \varepsilon \approx 0 \text{ for } \delta x \approx 0$$

where now the wiggly equals $\approx$ only means “approximately equals” in an intuitive sense.
This expresses the change in a function $f[x + \delta x] - f[x]$ in moving from x to $x + \delta x$ as
approximately given by a change $f'[x]\,\delta x$, linear in $\delta x$, with an error $\varepsilon \cdot \delta x$ that is small
compared to $\delta x$, $(\varepsilon \cdot \delta x)/\delta x = \varepsilon \approx 0$.

This is a powerful intuitive formulation of the approximation of derivatives. (It is often
called ‘Landau’s small oh formula.’) This also has a direct geometric counterpart in terms
of microscopes given in the main text, but here we use it symbolically.
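
The error term $\varepsilon$ in the increment approximation can be tabulated for a familiar function. The sketch below takes sine, whose derivative is cosine, and measures the largest $|\varepsilon|$ over sample points of $[0, 2\pi]$. The function names and the sampling grid are our own choices, and floating point steps stand in for infinitesimal ones.

```python
import math

def epsilon(x, dx):
    """Error in the increment approximation for sine, whose derivative is cosine."""
    return (math.sin(x + dx) - math.sin(x)) / dx - math.cos(x)

def largest_epsilon(a, b, dx, samples=1000):
    """Largest |epsilon| over an even grid of sample points on [a, b]."""
    xs = [a + (b - a) * i / samples for i in range(samples + 1)]
    return max(abs(epsilon(x, dx)) for x in xs)

for dx in (0.1, 0.01, 0.001):
    # Taylor's formula gives |epsilon| <= |dx| / 2, because |sin''| <= 1.
    print(dx, largest_epsilon(0.0, 2 * math.pi, dx), dx / 2)
```

The largest error over the whole interval shrinks with the step, which is the uniform behavior that the rest of the chapter relies on.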

An Antiderivative 

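Suppose that we begin with a function $f[x]$ and know an antiderivative $F[x]$, that is,

$$\frac{dF}{dx}[x] = f[x] \quad \text{for } a \le x \le b$$

The approximation above becomes

$$F[x + \delta x] - F[x] = f[x]\,\delta x + \varepsilon \cdot \delta x \quad \text{with } \varepsilon \approx 0 \text{ for } \delta x \approx 0$$

Flip this around to tell us

$$f[x]\,\delta x = (F[x + \delta x] - F[x]) - \varepsilon \cdot \delta x \quad \text{with } \varepsilon \approx 0 \text{ for } \delta x \approx 0$$

provided that $a \le x \le b$.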

Integrals are Sums of Slices 

The main idea of integral calculus is that integrals are ‘sums of slices.’ One way to express
this is

$$\int_a^b f[x]\,dx = \lim_{\Delta x \to 0} \left( f[a]\Delta x + f[a + \Delta x]\Delta x + f[a + 2\Delta x]\Delta x + f[a + 3\Delta x]\Delta x + \cdots + f[b - 2\Delta x]\Delta x + f[b - \Delta x]\Delta x \right)$$

where the sum is over values of $f[x]\,\Delta x$ where x starts at a and goes in steps of size $\Delta x$
until we get to the slice ending at b.




The limiting quantity is approximately equal to the integral when the step size is “small
enough.”

$$\int_a^b f[x]\,dx \approx f[a]\,\delta x + f[a + \delta x]\,\delta x + \cdots + f[b - \delta x]\,\delta x \quad \text{for } \delta x \approx 0$$

where $\approx$ temporarily only means the intuitive “approximately equals.”

Now we incorporate the differential approximation above at each of the x points, $x = a$,
$x = a + \delta x$, $\cdots$, $x = b - \delta x$, in our sum approximation, obtaining

$$\int_a^b f[x]\,dx \approx \big( (F[a + \delta x] - F[a]) + (F[a + 2\delta x] - F[a + \delta x]) + \cdots + (F[b] - F[b - \delta x]) \big) - \big( \varepsilon[a, \delta x]\,\delta x + \varepsilon[a + \delta x, \delta x]\,\delta x + \cdots + \varepsilon[b - \delta x, \delta x]\,\delta x \big)$$

The first sum ‘telescopes,’ that is, positive leading terms in one summand cancel negative
second terms in the next, all except for the first and last terms,

$$(F[a + \delta x] - F[a]) + (F[a + 2\delta x] - F[a + \delta x]) + \cdots + (F[b] - F[b - \delta x]) = -F[a] + F[b]$$

The second sum can be estimated as follows,

$$|\varepsilon[a, \delta x]\,\delta x + \varepsilon[a + \delta x, \delta x]\,\delta x + \cdots + \varepsilon[b - \delta x, \delta x]\,\delta x|$$

$$\le |\varepsilon[a, \delta x]|\,\delta x + |\varepsilon[a + \delta x, \delta x]|\,\delta x + \cdots + |\varepsilon[b - \delta x, \delta x]|\,\delta x$$

$$\le |\varepsilon_{Max}| (\delta x + \delta x + \cdots + \delta x)$$

$$\le |\varepsilon_{Max}| (b - a)$$

where $|\varepsilon_{Max}|$ is the largest of the small errors, $\varepsilon[x, \delta x]$, coming from the differential
approximation. The sum of $\delta x$ enough times to move from a to b is $b - a$, the distance moved.

As long as we make the largest error small enough, the summed error less than $|\varepsilon_{Max}|(b - a)$
will also be small, so

$$\int_a^b f[x]\,dx \approx F[b] - F[a]$$

But these are both fixed and do not depend on how small we take $\delta x$, hence

$$\int_a^b f[x]\,dx = F[b] - F[a]$$

This intuitive estimation illustrates the Fundamental Theorem of Integral Calculus.
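
The error estimate in this argument can be checked numerically. The sketch below takes cosine as the integrand and sine as its antiderivative on $[0, 1]$, a hypothetical example of our own choosing, and compares the gap between the sum of slices and $F[b] - F[a]$ with the bound $|\varepsilon_{Max}|(b - a)$. The function name `slices` is ours, and real steps stand in for infinitesimal ones.

```python
import math

def slices(f, F, a, b, n):
    """Return (sum of slices, largest |epsilon|) for n equal steps on [a, b]."""
    dx = (b - a) / n
    total, eps_max = 0.0, 0.0
    for k in range(n):
        x = a + k * dx
        total += f(x) * dx
        # Error in the differential approximation at this partition point.
        eps = (F(x + dx) - F(x)) / dx - f(x)
        eps_max = max(eps_max, abs(eps))
    return total, eps_max

a, b = 0.0, 1.0
for n in (10, 100, 1000):
    total, eps_max = slices(math.cos, math.sin, a, b, n)
    gap = abs(total - (math.sin(b) - math.sin(a)))
    print(n, gap, eps_max * (b - a))
```

In each row the gap stays below the bound, and both shrink as the number of slices grows.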

Theorem 5.1. Fundamental Theorem of Integral Calculus: Part 1 

Suppose the real function $f[x]$ has an antiderivative, that is, a real function $F[x]$
so that the derivative of $F[x]$ satisfies

$$\frac{dF}{dx}[x] = f[x] \quad \text{for all } x \text{ with } a \le x \le b$$

Then

$$\int_a^b f[x]\,dx = F[b] - F[a]$$

The above is not a formal proof, because we have not kept track of the “approximately
equal” errors. This can be completed either from the “$\varepsilon - \delta$” theory of limits or by using
hyperreal infinitesimals. Both justifications follow in separate subsections.

5.1.1 Rigorous Infinitesimal Justification 

First, we take as our definition of derivative, ^ = f[x], condition (a) of Theorem 3.5: 
for every hyperreal x with a < x < b and every 6x infinitesimal, there is an infinitesimal e 
so that the extended functions satisfy 

F[x + fe] — F[x] = f[x] Sx + e ■ Sx 

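The only thing we need to know from the theory of infinitesimals is that ε_Max
exists in the sense that

$$|\varepsilon[a,\delta x]\,\delta x + \varepsilon[a+\delta x,\delta x]\,\delta x + \cdots + \varepsilon[b-\delta x,\delta x]\,\delta x|$$
$$\le |\varepsilon[a,\delta x]|\,\delta x + |\varepsilon[a+\delta x,\delta x]|\,\delta x + \cdots + |\varepsilon[b-\delta x,\delta x]|\,\delta x$$
$$\le |\varepsilon_{Max}|\,(\delta x + \delta x + \cdots + \delta x)$$
$$\le |\varepsilon_{Max}|\,(b-a)$$

still holds when δx is infinitesimal. This follows from the Function Extension Axiom. Let
ε[x, Δx] be the real function of the real variables x and Δx,

$$\varepsilon[x,\Delta x] = \frac{F[x+\Delta x] - F[x]}{\Delta x} - f[x]$$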



For each ordinary real Δx, there is an x of the form x_m = a + mΔx (m = 1, 2, 3, ···)
so that the inequalities above hold with ε_Max = ε[x_m, Δx]. This is just a finite maximum
being attained at one of the values. Define a real function m[Δx] = x_m.
Define a real function

$$S[\Delta x] = |\varepsilon[a,\Delta x]\,\Delta x + \varepsilon[a+\Delta x,\Delta x]\,\Delta x + \cdots + \varepsilon[b-\Delta x,\Delta x]\,\Delta x|$$

The inequalities above say that for real Δx

$$S[\Delta x] \le |\varepsilon[m[\Delta x],\Delta x]|\,(b-a)$$

The Function Extension Axiom says

$$S[\delta x] \le |\varepsilon[m[\delta x],\delta x]|\,(b-a)$$

and the definition of derivative says that ε[m[δx], δx] is infinitesimal, provided δx is
infinitesimal. Since an infinitesimal times the finite number (b - a) is also infinitesimal, we have
shown that the difference between the real integral and the real answer

$$S[\delta x] = \left|\int_a^b f[x]\,dx - (F[b] - F[a])\right|$$

is infinitesimal. This means that they must be equal, since ordinary numbers can not differ 
by an infinitesimal unless they are equal. 




5.1.2 Rigorous Limit Justification

We need our total error to be small. This total error, Error_Integral, is the difference
between the quantity F[b] - F[a] and the integral, so by the calculation above,

$$\mathrm{Error}_{Integral} = \varepsilon[a,\Delta x]\,\Delta x + \varepsilon[a+\Delta x,\Delta x]\,\Delta x + \cdots + \varepsilon[b-\Delta x,\Delta x]\,\Delta x$$

We know from the calculation above that |Error_Integral| ≤ |ε_Max|(b - a). If we choose
an arbitrary error tolerance of θ, then it is sufficient to have |ε_Max| ≤ θ/(b - a), because
then we will have |Error_Integral| ≤ θ. This means that we must show that the differential
approximation

$$f[x]\,\Delta x = (F[x+\Delta x] - F[x]) - \varepsilon\cdot\Delta x$$

holds with |ε| ≤ θ/(b - a) for every x in [a, b]. Using the algebra above in reverse, this is
the same as showing that

$$\frac{F[x+\Delta x] - F[x]}{\Delta x} - f[x] = \varepsilon[x,\Delta x]$$

is never more than θ/(b - a), provided that Δx is small enough. The traditional way to say
this is

$$\lim_{\Delta x\to 0} \frac{F[x+\Delta x] - F[x]}{\Delta x} = f[x] \quad \text{uniformly for } a \le x \le b$$



The rigorous definition of the limit in question is: for every tolerance η and every x in [a, b],
there exists a μ, such that if |Δx| < μ, then

$$\left|\frac{F[x+\Delta x] - F[x]}{\Delta x} - f[x]\right| < \eta$$

This is the formal definition of derivative condition (b) of Theorem 3.5. Our proof of the
Fundamental Theorem is complete (letting η = θ/(b - a)). The hypothesis says that if we
can find a function f[x] so that

$$\lim_{\Delta x\to 0} \frac{F[x+\Delta x] - F[x]}{\Delta x} = f[x] \quad \text{uniformly for } a \le x \le b$$



then the conclusion is 



$$\int_a^b f[x]\,dx = F[b] - F[a]$$



(Notice that the existence of the limit defining the integral is part of our proof. The function 
f[x] is continuous, because of our strong definition of derivative.) 
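To make the choice of Δx concrete, here is a small Python sketch. It is an illustration under our own assumptions: the example F[x] = f[x] = Exp[x] on [0, 1], the tolerance θ = 0.001, and the helper names are hypothetical, and the maximum is only sampled at finitely many x, so this is numerical evidence rather than a proof. It halves Δx until the sampled error ε[x, Δx] is at most θ/(b - a).

```python
import math

def max_quotient_error(F, f, a, b, dx, samples=2001):
    """Largest sampled |(F[x+dx] - F[x])/dx - f[x]| for a <= x <= b."""
    worst = 0.0
    for i in range(samples):
        x = a + (b - a) * i / (samples - 1)
        worst = max(worst, abs((F(x + dx) - F(x)) / dx - f(x)))
    return worst

def step_for_tolerance(F, f, a, b, theta):
    """Halve dx until the sampled error is at most theta / (b - a)."""
    dx = b - a
    while max_quotient_error(F, f, a, b, dx) > theta / (b - a):
        dx /= 2
    return dx

dx = step_for_tolerance(math.exp, math.exp, 0.0, 1.0, 1e-3)
print(dx, max_quotient_error(math.exp, math.exp, 0.0, 1.0, dx))
```

One step size works for every sampled x at once, which is the point of the word "uniformly" in the limit above.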



5.2 Derivatives, Epsilons and Deltas 



The fundamental approximation defining the derivative of a real valued 
function can be formulated with or without infinitesimals as follows. 



Definition 5.2. The Rigorous Derivative 

In Theorem 3.5 we saw that the following are equivalent definitions of, “The real
function f[x] is smooth with derivative f'[x] on the interval (a, b).”

(a) Whenever a hyperreal x satisfies a < x < b and x is not infinitely near a
or b, then an infinitesimal increment of the naturally extended dependent
variable is approximately linear, that is, whenever δx ≈ 0

$$f[x+\delta x] - f[x] = f'[x]\,\delta x + \varepsilon\cdot\delta x$$

for some ε ≈ 0.

(b) For every compact subinterval [α, β] ⊂ (a, b),

$$\lim_{\Delta x\to 0} \frac{f[x+\Delta x] - f[x]}{\Delta x} = f'[x] \quad \text{uniformly for } \alpha \le x \le \beta$$

in other words, for every accuracy tolerance θ and every real α and β in
(a, b), there is a sufficiently small positive real number γ such that if the real
number Δx satisfies 0 < |Δx| < γ and the real number x satisfies α ≤ x ≤ β,
then

$$\left|\frac{f[x+\Delta x] - f[x]}{\Delta x} - f'[x]\right| < \theta$$

(c) For every real c in (a, b),

$$\lim_{x\to c,\ \Delta x\to 0} \frac{f[x+\Delta x] - f[x]}{\Delta x} = f'[c]$$

That is, for every real c with a < c < b and every real positive θ, there is a
real positive γ such that if the real numbers x and Δx satisfy 0 < |Δx| < γ
and 0 < |x - c| < γ, then

$$\left|\frac{f[x+\Delta x] - f[x]}{\Delta x} - f'[c]\right| < \theta$$



All derivatives computed by rules satisfy this strong approximation provided the formulas 
are valid on the interval. This is proved in Theorem 5.5 below. 



5.3 Smoothness ⇒ Continuity of Function and Derivative



This section shows that differentiability in the sense of Definition 5.2 implies 
that the function and derivative are continuous. 



One difficult thing about learning new material is putting new facts together. Bolzano’s 
Theorem and Darboux’s Theorem have hypotheses that certain functions are continuous. 
This means you must show that the function you are working with is continuous. How do 
you tell if a function is continuous? You can’t ‘look’ at a graph if you haven’t drawn one 
and are using calculus to do so. What does continuity mean? Intuitively, it just means that 
small changes in the independent variable produce only small changes in the dependent 
variable. 






In Theorem ?? we showed that the following are equivalent definitions
Definition 5.3. Continuity of f[x] 

Suppose a real function f[x] is defined for at least a small neighborhood of a real
number a, |x - a| < Δ, for some positive real Δ.

(a) f[x] is continuous at a if whenever a hyperreal x satisfies x ≈ a, the natural
extension satisfies f[x] ≈ f[a].

(b) f[x] is continuous at a if lim_{x→a} f[x] = f[a].

Intuitively, this just means that f[x] is close to f[a] when x is close to a: for every x ≈ a,
f[x] is defined and

$$f[x] \approx f[a]$$

The rules of calculus (together with Theorem 5.5) make it easy to verify that functions 
given by formulas are continuous: Simply calculate the derivative. 

Theorem 5.4. Continuity of f[x] and f'[x] 

Suppose the real function f[x] is smooth on the real interval (a,b) (see Defini- 
tion 5.2). Then both f[x] and f'[x] are continuous at every real point c in (a,b). 

Proof for f[x]: 

Proof of continuity of f is easy algebraically but is obvious geometrically: A graph that is
indistinguishable from linear in a microscope clearly only moves a small amount in a small
x-step. Draw the picture on a small scale.

Algebraically, we want to show that if x_1 ≈ x_2 then f[x_1] ≈ f[x_2], condition (a) above.
Let c be any real point in (a, b) and x = x_1 = c. Let δx = x_2 - x_1 be any infinitesimal and
use the approximation f[x_2] = f[x + δx] = f[x_1] + f'[x_1]δx + εδx. The quantity [f'[x_1] + ε]δx
is medium times small = small, so f[x_1] ≈ f[x_2], by Theorem 1.12 (c). That is the algebraic
proof.

Proof for f'[x]: 

Proof of continuity of f'[x] requires us to view the increment from both ends. First take
any real c in (a, b), x = x_1 = c, and δx = x_2 - x_1 any nonzero infinitesimal. Use the
approximation

$$f[x_2] = f[x+\delta x] = f[x_1] + f'[x_1]\,\delta x + \varepsilon_1\,\delta x$$

Next let x = x_2, δx = x_1 - x_2 and use the approximation

$$f[x_1] = f[x+\delta x] = f[x_2] + f'[x_2]\,\delta x + \varepsilon_2\,\delta x$$

The different x-increments are negatives, so we have

$$f[x_1] - f[x_2] = f'[x_2](x_1 - x_2) + \varepsilon_2(x_1 - x_2)$$

and

$$f[x_2] - f[x_1] = f'[x_1](x_2 - x_1) + \varepsilon_1(x_2 - x_1)$$

Adding, we obtain

$$0 = \big((f'[x_2] - f'[x_1]) + (\varepsilon_2 - \varepsilon_1)\big)(x_1 - x_2)$$

Dividing by the non-zero (x_1 - x_2), we see that

$$f'[x_2] = f'[x_1] + (\varepsilon_1 - \varepsilon_2), \quad \text{so} \quad f'[x_2] \approx f'[x_1]$$



Note: 
The derivative defined in many calculus books is a weaker pointwise notion than the
notion of smoothness we have defined. The weak derivative function need not be continuous. 
(The same approximation does not apply at both ends with the weak definition.) This is 
explained in Chapter 6 on Pointwise Approximations. 



Exercise set 5.3

1. (a) Consider the real function f[x] = 1/x, which is undefined at x = 0. We could
extend the definition by simply assigning f[0] = 0. Show that this function is not
continuous at x = 0 but is continuous at every other real x.

(b) Give an intuitive graphical description of the definition of continuity in terms of
powerful microscopes and explain why it follows that smooth functions must be
continuous.

(c) The function f[x] = √x is defined for x ≥ 0; there is nothing wrong with f[0].
However, our increment computation for √x above was not valid at x = 0 because
a microscopic view of the graph focused at x = 0 looks like a vertical ray (or half-
line). Explain why this is so, but show that f[x] is still continuous “from the
right;” that is, if 0 < x ≈ 0, then √x ≈ 0 but √x/x is very large.



5.4 Rules ⇒ Smoothness




This section shows that when we can compute a derivative by rules, then 
the smoothness Definition 5.2 is automatically satisfied on intervals where 
both formulas are defined. 








Theorem 5.5. Rules ⇒ Smooth

Suppose a function y = f[x] is given by a formula to which we can apply the rules
of Chapter 6 of the main text, obtaining a formula for f'[x]. If both f[x] and
f'[x] are defined on the real interval (a,b), then they satisfy Definition 5.2 and, by
Theorem 5.4, are continuous on (a,b).



Proof: 

This is a special case of Theorem 10.1. 



Exercise set 5.4

1. What is the simple way to tell if a function is continuous?

2. Suppose that y = f[x] is given by a formula that you can differentiate by the rules of
calculus from Chapter 6 of the main text. As you know, you can differentiate many,
many formulas. What property does the function dy/dx = f'[x] have to have so that you
can conclude f[x] is continuous at all points of the interval a < x < b? (What about
f[x] = 1/x at x = 0?)

Give examples of:

(a) One y = f[x] which you can differentiate by rules and an interval [a, b] where
f'[x] is defined and f[x] is continuous on the whole interval.

(b) Another y = f[x] which you can differentiate by rules, but where f'[x] fails to be
defined on all of [a, b] and where f[x] is not continuous at some point a < c < b.

(Hint: Read Theorem 5.4. What about y = √x?)

3. What properties does the function dy/dx = f'[x] have to have so that you can conclude
f'[x] is continuous at all points of the interval a < x < b?

Give examples of:

(a) One y = f[x] which you can differentiate by rules and an interval [a, b] where
f'[x] is defined and f'[x] is continuous on the whole interval.

(b) Another y = f[x] which you can differentiate by rules, but where f'[x] fails to be
defined on all of [a, b] and where f'[x] is not continuous at some point a < c < b.

(Note: If f'[x] is undefined at x = c, it cannot be considered continuous at c.
Well, there is a sticky point here. Perhaps f'[x] could be extended at an undefined
point so that it would become continuous with the extension. It is actually fairly
easy to rule this out with one of the functions you have worked with in previous
homework problems.)



5.5 The Increment and Increasing 



A positive derivative means a function is increasing near the point. We 
prove this algebraically in this section. 



It is ‘clear’ that if we view a graph in an infinitesimal microscope at a point x_0 and
see the graph as indistinguishable from an upward sloping line, then the function must be
‘increasing’ near x_0. Certainly, the graph need not be increasing everywhere - draw y = x^2
and consider the point x_0 = 1 with f'[1] = 2. Exactly how should we formulate this? Even
if you don’t care about the symbolic proof of the algebraic formulation, the formulation
itself may be useful in cases where you don’t have graphs.

One way to say f[x] increases near x_0 would be to say that if x_1 < x_0 < x_2 (and these
points are not ‘too far’ from x_0), then f[x_1] < f[x_0] < f[x_2]. Another way to formulate
the problem is to say that if x_1 < x_2 (and these points are not ‘too far’ from x_0), then
f[x_1] < f[x_2]. Surprisingly, the second formulation is more difficult to prove (and even fails
for pointwise differentiable functions). The second formulation essentially requires that we
can move the microscope from x_0 to x_1 and continue to see an upward sloping line. We
know from Theorem 5.4 that if x_1 ≈ x_0 the slope of the microscopic line only changes a
small amount, so we actually see the same straight line.
Proof

We will only prove case (a), since the second case is proved similarly. First we verify that 
f[x] is increasing on a microscopic scale. The idea is simple: Compute the change in f[x] 
using the positive slope straight line and keep track of the error. 

Take x_1 and x_2 so that x_0 ≈ x_1 < x_2 ≈ x_0. Since f'[x_1] ≈ f'[x_0] by Theorem 5.4 we may
write f'[x_1] = m + ε_1 where m = f'[x_0] and ε_1 ≈ 0. Let δx = x_2 - x_1 so

$$f[x_2] = f[x_1 + \delta x] = f[x_1] + f'[x_1]\cdot\delta x + \varepsilon_2\cdot\delta x$$
$$= f[x_1] + m\cdot\delta x + (\varepsilon_1 + \varepsilon_2)\cdot\delta x$$

The number m is a real positive number, so m + ε_1 + ε_2 > 0 and, since δx > 0,
(m + ε_1 + ε_2)·δx > 0. This means f[x_2] - f[x_1] > 0 and f[x_2] > f[x_1]. This proves that for any
infinitesimal interval [α, β] with α < x_0 < β, the function satisfies

$$\alpha \le x_1 < x_2 \le \beta \ \Rightarrow\ f[x_1] < f[x_2]$$

The Function Extension Axiom guarantees that real numbers α and β exist satisfying the
inequalities above, since if the equation fails for all real α and β, it would fail for infinitely
close ones. That completes the proof.

Example 5.1. A Non-Increasing Function with Pointwise Derivative 1 
The function

$$f[x] = \begin{cases} x + x^2\,\mathrm{Sin}[\pi/x], & \text{if } x \ne 0 \\ 0, & \text{if } x = 0 \end{cases}$$

has a pointwise derivative at every point, and its pointwise derivative at x = 0 equals 1
(but it is not differentiable in the usual sense of Definition 5.2). This function is not
increasing in any neighborhood of zero (so it shows why the pointwise derivative is not
strong enough to capture the intuitive idea of the microscope). See Example 6.3.1 for
more details.
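The failure to increase can be seen numerically. The Python sketch below is an illustration under stated assumptions: it takes the function to be x + x^2 Sin[π/x] with value 0 at x = 0 (our reading of the formula in this example), and the helper `decreasing_pair` and the step h = 1/(100 k^2) are our own choices. Near x = 1/(2k) the cosine term in the slope equals 1, so the slope is about 1 - π < 0, and we expect a pair x_1 < x_2 with f[x_1] > f[x_2] arbitrarily close to zero.

```python
import math

def f(x):
    """x + x^2 Sin[pi/x], extended by f[0] = 0."""
    return x + x * x * math.sin(math.pi / x) if x != 0 else 0.0

def decreasing_pair(k):
    """Return x1 < x2 on either side of 1/(2k), where the slope is near 1 - pi."""
    center = 1.0 / (2 * k)
    h = 1.0 / (100 * k * k)   # small compared with the width of one oscillation
    return center - h, center + h

for k in (10, 100, 1000):
    x1, x2 = decreasing_pair(k)
    print(k, x1 < x2, f(x1) > f(x2))

# The difference quotient at zero still tends to 1.
for h in (1e-1, 1e-3, 1e-5):
    print(h, (f(h) - f(0.0)) / h)
```

Taking k larger moves the decreasing pair as close to zero as we like, while the difference quotient at zero itself stays near 1.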

5.6 Inverse Functions and Derivatives 

If a function has a nonzero derivative at a point, then it has a local inverse. 
The project on Inverse Functions expands this section with more details. 
The inverse of a function y = f[x] is the function x = g[y] whose ‘rule un-does what the
rule for f does.’

$$g[f[x]] = x$$

If we have a formula for f[x], we can try to solve the equation y = f[x] for x. If we are
successful, this gives us a formula for g[y],

$$y = f[x] \ \Leftrightarrow\ x = g[y]$$

Example 5.2. y = x^2 and the Partial Inverse x = √y

For example, if y = f[x] = x^2, then x = g[y] = √y, at least when x ≥ 0. These two
functions have the same graph if we plot g with its independent variable where the y axis
normally goes, rather than plotting the input variable of g[y] on the horizontal scale.

Figure 5.1: y = f[x] and its inverse x = g[y] (graph of y = f[x], x > 0)



The graph of x = g[y] operationally gives the function g by choosing a y value on the
y axis, moving horizontally to the graph, and then moving vertically to the x output on
the x axis. This makes it clear graphically that the rule for g ‘un-does’ what the rule for
f does. If we first compute f[x] and then substitute that answer into g[y], we end up with
the original x.

Example 5.3. y = f[x] = x^9 + x^7 + x^5 + x^3 + x and its Inverse

The graph of the function y = f[x] = x^9 + x^7 + x^5 + x^3 + x is always increasing because
f'[x] = 9x^8 + 7x^6 + 5x^4 + 3x^2 + 1 > 0 is positive for all x. Since we know lim_{x→-∞} f[x] = -∞
and lim_{x→+∞} f[x] = +∞, Bolzano’s Intermediate Value Theorem 4.5 says that f[x] attains
every real value y. By Theorem 5.6, f[x] can attain each value only once. This means that
for every real y, there is an x = g[y] so that f[x] = y. In other words, we see abstractly that
f[x] has an inverse without actually solving the equation y = x^9 + x^7 + x^5 + x^3 + x for x as
a function of y.
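Although there is no formula for this inverse, its values can be approximated. The Python sketch below uses bisection, which relies only on the two facts used in the example: f[x] is increasing and attains every real value. Bisection is our own choice of method for illustration (the text's Inverse Function project uses a different iteration), and the function names are hypothetical.

```python
def f(x):
    return x**9 + x**7 + x**5 + x**3 + x

def g(y, iterations=200):
    """Approximate the inverse value x = g[y] by bisection."""
    lo, hi = -1.0, 1.0
    # Widen the bracket until f[lo] <= y <= f[hi]; this terminates
    # because f[x] tends to -infinity and +infinity.
    while f(lo) > y:
        lo *= 2
    while f(hi) < y:
        hi *= 2
    # Since f is increasing, the solution stays inside [lo, hi].
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for y in (-3.0, 0.0, 5.0, 1000.0):
    x = g(y)
    print(y, x, f(x))
```

Since f[1] = 5, the value printed for y = 5 should be very close to x = 1.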




The function y = Tan[x] has derivative dy/dx = 1/(Cos[x])^2. When -π/2 < x < π/2, cosine is
not zero and therefore the tangent is increasing for -π/2 < x < π/2. How do we solve for
x in the equation y = Tan[x]?

$$y = \mathrm{Tan}[x]$$
$$\mathrm{ArcTan}[y] = \mathrm{ArcTan}[\mathrm{Tan}[x]] = x$$
$$x = \mathrm{ArcTan}[y]$$

But what is the arctangent? By definition, the inverse of tangent on (-π/2, π/2). So how
would we compute it? The Inverse Function project answers this question.

Example 5.5. A Non-Elementary Inverse

Some functions do not have classical expressions for their inverses. Let

$$y = f[x] = x^x$$

This may be written using x = e^{Log[x]}, so x^x = (e^{Log[x]})^x = e^{x Log[x]} and f[x] has derivative

$$\frac{dy}{dx} = \frac{d(e^{x\,\mathrm{Log}[x]})}{dx}$$
$$= \left(\mathrm{Log}[x] + x\cdot\frac{1}{x}\right) e^{x\,\mathrm{Log}[x]}$$
$$= (1 + \mathrm{Log}[x])\,x^x$$

It is clear graphically that y = f[x] has an inverse on either the interval (0, 1/e) or
(1/e, ∞). We find where the derivative is positive, negative and zero as follows. First,
x^x = e^{x Log[x]} is always positive, never zero, so

$$0 = (1 + \mathrm{Log}[x])\,x^x$$
$$0 = (1 + \mathrm{Log}[x])$$
$$-1 = \mathrm{Log}[x]$$
$$e^{-1} = e^{\mathrm{Log}[x]}$$
$$\frac{1}{e} = x$$

If x < 1/e, say x = 1/e^2, then dy/dx = (1 + Log[e^{-2}])(+) = (1 - 2)(+) = (-) < 0. If
x = e > 1/e, dy/dx = (1 + Log[e])(+) = (2)(+) = (+) > 0. So dy/dx < 0 for 0 < x < 1/e and
dy/dx > 0 for 1/e < x < ∞. (Note our use of Darboux’s Theorem 7.2.) This means that
f[x] = x^x has an inverse for x > 1/e.

It turns out that the inverse function x = g[y] can not be expressed in terms of any of the
classical functions. In other words, there is no formula for g[y]. (This is similar to the non-
elementary integrals in the Mathematica NoteBook Symbolicintegr. Computer algebra
systems have a non-elementary function ω[x] which can be used to express the inverse.) The
Inverse Function project has you compute the inverse approximately.
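The derivative formula and the sign analysis for x^x can be checked numerically. The following Python sketch is an illustration only: the sample points and the helper `central_difference` are our own choices. It compares (1 + Log[x]) x^x with a symmetric difference quotient and shows the sign change at x = 1/e.

```python
import math

def f(x):
    return x ** x

def fprime(x):
    """The derivative computed in the example: (1 + Log[x]) x^x."""
    return (1 + math.log(x)) * x ** x

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.1, 1 / math.e, 1.0, 2.0):
    print(x, fprime(x), central_difference(f, x))
```

The first row is negative, the row for x = 1/e is essentially zero, and the later rows are positive, in agreement with the computation above.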

Example 5.6. A Microscopic View of the Graph 

We view the graph x = g[y] for the inverse as the graph y = f[x] with the roles of the 
horizontal and vertical axes reversed. In other words, both functions have the same graph, 
but we view y as the input to the inverse function g[y]. A microscopic view of the graph 
can likewise be viewed as that of either y = f[x] or x = g[y]. 
Figure 5.2: Small View of x = g[y] and y = f[x] at (x_0, y_0)

The ratio of a change in g-output dx to a change in g-input dy for the linear graph is
the reciprocal of the change in f-output dy to the change in f-input dx for the function.
In other words, if the inverse function really exists and is differentiable, we see from the
microscopic view of the graph that we should have

$$\frac{dy}{dx} = f'[x_0] \quad \text{and} \quad \frac{dx}{dy} = \frac{1}{dy/dx} = g'[y_0]$$



The picture is right, of course, and the Inverse Function Theorem 5.7 justifies the needed 
steps algebraically (in case you don’t trust the picture.) 

Example 5.7. The Symbolic Inverse Function Derivative 

Assume for the moment that f[x] and g[y] are smooth inverse functions. Apply the Chain 
Rule (in function notation) as follows. 



$$x = g[f[x]]$$
$$\frac{dx}{dx} = g'[f[x]]\cdot f'[x]$$
$$1 = g'[f[x]]\cdot f'[x]$$

At a point (x_0, y_0) on the graph of both functions, we have

$$g'[y_0] = \frac{1}{f'[x_0]}$$
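In differential notation, this reads like ordinary fractions,

$$x = g[y] \ \Leftrightarrow\ y = f[x]$$
$$\frac{dx}{dy} = \frac{1}{dy/dx}$$

$$x = \mathrm{ArcTan}[y] \ \Leftrightarrow\ y = \mathrm{Tan}[x]$$
$$\frac{dx}{dy} = \mathrm{ArcTan}'[y], \qquad \frac{dy}{dx} = \frac{1}{(\mathrm{Cos}[x])^2}$$
$$\mathrm{ArcTan}'[y] = \frac{dx}{dy} = (\mathrm{Cos}[x])^2$$

Correct, but not in the form of a function of the same input variable y. We know that
Tan^2[x] = Sin^2[x]/Cos^2[x] = y^2 and Sin^2[x] + Cos^2[x] = 1, so we can express Cos^2[x] in terms of y,

$$\mathrm{Sin}^2[x] = y^2\,\mathrm{Cos}^2[x]$$
$$y^2\,\mathrm{Cos}^2[x] + \mathrm{Cos}^2[x] = 1$$
$$\mathrm{Cos}^2[x] = \frac{1}{1+y^2}$$

So we can write

$$\mathrm{ArcTan}'[y] = (\mathrm{Cos}[x])^2 = \frac{1}{1+y^2}$$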

The point of this concrete example is that we can compute the derivative of the arctangent 
even though we don’t have a way (yet) to compute the arctangent. In general, the derivative 
of an inverse function at a point is the reciprocal of the derivative of the function. In this 
case a trig trick lets us find a general expression in terms of y as well. 
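As a quick numerical check of the reciprocal rule for the arctangent, the Python sketch below compares three quantities at several values of y: the formula 1/(1 + y^2), the reciprocal 1/(dy/dx) = (Cos[x])^2 evaluated at x = ArcTan[y], and a difference quotient of the library arctangent. The function names and sample points are our own, chosen for illustration.

```python
import math

def arctan_prime(y):
    """The closed form derived in the text: 1 / (1 + y^2)."""
    return 1.0 / (1.0 + y * y)

def reciprocal_rule(y):
    """1/(dy/dx) at x = ArcTan[y], using dy/dx = 1/(Cos[x])^2."""
    x = math.atan(y)
    return math.cos(x) ** 2

def difference_quotient(y, h=1e-6):
    """Symmetric difference quotient of the library arctangent."""
    return (math.atan(y + h) - math.atan(y - h)) / (2 * h)

for y in (-3.0, 0.0, 0.5, 10.0):
    print(y, arctan_prime(y), reciprocal_rule(y), difference_quotient(y))
```

All three columns should agree to many decimal places.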

Example 5.9. Another Inverse Derivative 

It is sometimes easier to compute the derivative of the inverse function and invert for the
derivative of the function itself, even if it is possible to differentiate the inverse function.
For example, if y = x^2 + 1 and x = √(y - 1) when y > 1, then dy/dx = 2x. The inverse function
rule says

$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{2x} = \frac{1}{2\sqrt{y-1}}$$



The point of the last two examples is that computing derivatives by reciprocals is some- 
times helpful. The next result justifies the method. 

Theorem 5.7. The Inverse Function Theorem 

Suppose y = f[x] is a real function that is smooth on an interval containing a real
point x_0 where f'[x_0] ≠ 0. Then

(a) There is a smooth real function g[y] defined when |y - y_0| < Δ, for some
real Δ > 0.

(b) There is a real ε > 0 such that if |x - x_0| < ε, then |f[x] - y_0| < Δ and
g[f[x]] = x.

g[y] is a “local” inverse for f[x]. 



Proof:

Suppose we have a function y = f[x] and know that f'[x] exists on an interval around a
point x = x_0 and we know the values y_0 = f[x_0] and m = f'[x_0] ≠ 0. In a microscope we
would see the graph

Figure 5.3: Small View of y = f[x] at (x_0, y_0)

The point (dx, dy) = (0, 0) in local coordinates is at (x_0, y_0) in regular coordinates.

Suppose we are given y near y_0, y ≈ y_0. In the microscope, this appears on the dy axis
at the local coordinate dy = y - y_0. The corresponding dx value is easily computed by
inverting the linear approximation

$$dy = m\,dx$$
$$m\,dx = dy$$
$$dx = dy/m$$

The value x_1 that corresponds to dx = dy/m is dx = x_1 - x_0. Solve for x_1,

$$x_1 = x_0 + dx$$
$$= x_0 + dy/m$$
$$= x_0 + (y - y_0)/m$$

Does this value of x = x_1 satisfy y = f[x_1] for the value of y we started with? We wouldn’t
think it would be exact because we computed linearly and we only know the approximation

$$f[x_0 + dx] = f[x_0] + f'[x_0]\,dx + \varepsilon\cdot dx$$
$$f[x_1] = f[x_0] + m\cdot(x_1 - x_0) + \varepsilon\cdot(x_1 - x_0)$$



We know that the error ε ≈ 0 is small when dx ≈ 0 is small, so we would have to move the
microscope to see the error. Moving along the tangent line until we are centered over x_1,
we might see

The graph of y = f[x] appears to be the parallel line above the tangent, because we have
only moved x a small amount and f'[x] is continuous by Theorem 5.4. We don’t know how
to compute x = g[y] necessarily, but we do know how to compute y_1 = f[x_1]. Suppose we
have computed this and focus our microscope at (x_1, y_1) seeing

Figure 5.5: Small View at (x_1, y_1)



We still have the original y ≈ y_1 and thus y still appears on the new view at dy = y - y_1.
The corresponding dx value is easily computed by inverting the linear approximation

$$dy = m\,dx$$
$$m\,dx = dy$$
$$dx = dy/m$$

The x value, x_2, that corresponds to dx = dy/m is dx = x_2 - x_1 with x_2 unknown. Solve
for the unknown,

$$x_2 = x_1 + dx$$
$$= x_1 + dy/m$$
$$= x_1 + (y - y_1)/m$$

This is the same computation we did above to go from x_0 to x_1; in fact, this gives a discrete
dynamical system of successive approximations

$$x_0 = \text{given}$$
$$x_{n+1} = x_n + (y - f[x_n])/m$$
$$x_{n+1} = G[x_n], \quad \text{with} \quad G[x] = x + (y - f[x])/m$$

The sequence of approximations is given by the general function iteration explained in
Chapter 20 of the main text,

$$x_1 = G[x_0], \quad x_2 = G[x_1] = G[G[x_0]], \quad x_3 = G[G[G[x_0]]], \ \cdots$$

and we want to know how far we move,

$$\lim_{n\to\infty} x_n = \ ?$$
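The successive approximations are easy to run on a computer. The Python sketch below iterates x_{n+1} = x_n + (y - f[x_n])/m with the fixed slope m = f'[x_0]. The sample function f[x] = x^3 + x, the base point x_0 = 1 (so y_0 = 2 and m = 4), the target y = 2.1, and the name `local_inverse` are our own assumptions for illustration.

```python
def local_inverse(f, m, x0, y, steps=50):
    """Iterate x_{n+1} = x_n + (y - f[x_n]) / m starting from x0."""
    x = x0
    for _ in range(steps):
        x = x + (y - f(x)) / m   # this is x_{n+1} = G[x_n]
    return x

def f(x):
    return x**3 + x

# Near x0 = 1 we have y0 = f[1] = 2 and m = f'[1] = 4.
x = local_inverse(f, m=4.0, x0=1.0, y=2.1)
print(x, f(x))
```

The slope m is computed once at x_0 and never updated, which is what distinguishes this iteration from the usual form of Newton's method.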

The iteration function G[x] is smooth, in fact,

$$G'[x] = 1 - f'[x]/m$$

and in particular, G'[x_0] = 1 - f'[x_0]/m = 1 - m/m = 0. By Theorem 5.4, whenever x ≈ x_0,
G'[x] ≈ 0. Differentiability of G[x] means that if x_i ≈ x_j ≈ x_0,

$$G[x_i] - G[x_j] = G'[x_j]\cdot(x_i - x_j) + \iota\cdot(x_i - x_j)$$
$$|G[x_i] - G[x_j]| = |G'[x_j] + \iota|\cdot|x_i - x_j|$$
$$|G[x_i] - G[x_j]| \le r\cdot|x_i - x_j|$$

for any real positive r, since G'[x_j] ≈ G'[x_0] = 0 and ι ≈ 0.

If y ≈ y_0, then x_1 = x_0 + (y - y_0)/m ≈ x_0. Similarly, if x_n ≈ x_0, then f[x_n] ≈ y and
x_{n+1} ≈ x_0, so |G[x_{n+1}] - G[x_n]| ≤ r·|x_{n+1} - x_n|. Hence,

$$|x_2 - x_1| = |G[x_1] - G[x_0]| \le r|x_1 - x_0|$$
$$|x_3 - x_2| = |G[x_2] - G[x_1]| \le r|x_2 - x_1| \le r(r|x_1 - x_0|) = r^2|x_1 - x_0|$$
$$|x_4 - x_3| = |G[x_3] - G[x_2]| \le r|x_3 - x_2| \le r(r^2|x_1 - x_0|) = r^3|x_1 - x_0|$$

and in general

$$|x_{n+1} - x_n| \le r^n|x_1 - x_0|$$

The total distance moved in x is estimated as follows.

$$|x_{n+1} - x_0| = |(x_{n+1} - x_n) + (x_n - x_{n-1}) + (x_{n-1} - x_{n-2}) + \cdots + (x_1 - x_0)|$$
$$\le |x_{n+1} - x_n| + |x_n - x_{n-1}| + |x_{n-1} - x_{n-2}| + \cdots + |x_1 - x_0|$$
$$\le r^n|x_1 - x_0| + r^{n-1}|x_1 - x_0| + \cdots + |x_1 - x_0|$$
$$\le (r^n + r^{n-1} + \cdots + r + 1)\,|x_1 - x_0|$$
$$\le \frac{1 - r^{n+1}}{1 - r}\,|x_1 - x_0|$$

The sum 1 + r + r^2 + ··· + r^n = (1 - r^{n+1})/(1 - r) is a geometric series as studied in the main
text Chapter 25. Since lim_{n→∞} r^n = 0 if |r| < 1, 1 + r + r^2 + ··· + r^n → 1/(1 - r) for
|r| < 1. Thus, for any y ≈ y_0 and any real r with 0 < r < 1,

$$|x_n - x_0| \le \frac{1}{1 - r}\,|x_1 - x_0|$$

for all n = 1, 2, 3, ···

Similar reasoning shows that when y ≈ y_0

$$|x_{k+n} - x_k| \le \frac{r^k}{1 - r}\,|x_1 - x_0|$$

because

$$|x_{k+1} - x_k| \le r^k|x_1 - x_0|$$
$$|x_{k+2} - x_{k+1}| = |G[x_{k+1}] - G[x_k]| \le r|x_{k+1} - x_k| \le r(r^k|x_1 - x_0|)$$
$$|x_{k+3} - x_{k+2}| = |G[x_{k+2}] - G[x_{k+1}]| \le r|x_{k+2} - x_{k+1}| \le r^2(r^k|x_1 - x_0|)$$
Take the particular case r = 1/2. We have shown in particular that whenever 0 < δ ≈ 0
and |y - y_0| < δ, then for all k and n,

$$|x_n - x_0| \le 2|x_1 - x_0| \quad \text{and} \quad |x_{k+n} - x_k| \le \frac{2}{2^k}\,|x_1 - x_0|$$

and f[x] is defined and |f'[x] - f'[x_0]| < |m|/2 for |x - x_0| < 3δ/|m|. By the Function
Extension Axiom 2.1, there must be a positive real Δ such that if |y - y_0| < Δ, then for all
k and n,

$$|x_n - x_0| \le 2|x_1 - x_0| \quad \text{and} \quad |x_{k+n} - x_k| \le \frac{2}{2^k}\,|x_1 - x_0|$$

and f[x] is defined and |f'[x] - f'[x_0]| < |m|/2 for |x - x_0| < 3Δ/|m|. Fix this positive real Δ.

Also, if |x - x_0| < ε ≈ 0, then |f[x] - y_0| < Δ with Δ as above. By the Function
Extension Axiom 2.1, there is a real positive ε so that if |x - x_0| < ε < 3Δ/|m|, then
|f[x] - y_0| < Δ and |f'[x] - f'[x_0]| < |m|/2.

Now, take any real y with |y - y_0| < Δ and consider the sequence

$$x_1 = G[x_0], \quad x_2 = G[x_1] = G[G[x_0]], \quad x_3 = G[G[G[x_0]]], \ \cdots$$

This converges because once we have gotten to the approximation x_k, we never move beyond
that approximation by more than

$$|x_{k+n} - x_k| \le \frac{2}{2^k}\,|x_1 - x_0|$$

In other words, if we want an approximation good to one millionth, we need only take
k large enough so that

$$\frac{2}{2^k}\,|x_1 - x_0| < 10^{-6}$$
$$(1 - k)\,\mathrm{Log}[2] + \mathrm{Log}[|x_1 - x_0|] < -6\,\mathrm{Log}[10]$$
$$k > \frac{\mathrm{Log}[2] + \mathrm{Log}[|x_1 - x_0|] + 6\,\mathrm{Log}[10]}{\mathrm{Log}[2]}$$
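The step count from this error formula can be evaluated directly. The Python sketch below assumes r = 1/2 as in the text and x_1 ≠ x_0; the function name `steps_needed` and the sample values x_0 = 1, x_1 = 1.025 are hypothetical. It returns the smallest integer k satisfying the last inequality.

```python
import math

def steps_needed(x0, x1, digits=6):
    """Smallest integer k with (2 / 2**k) * |x1 - x0| < 10**(-digits)."""
    d = abs(x1 - x0)   # assumed nonzero, otherwise Log[d] is undefined
    bound = (math.log(2) + math.log(d) + digits * math.log(10)) / math.log(2)
    # k must be strictly greater than bound, and cannot be negative.
    return max(0, math.floor(bound) + 1)

k = steps_needed(1.0, 1.025)
print(k, (2 / 2**k) * 0.025)
```

The bound is a worst-case guarantee for the contraction factor 1/2, so the iteration may well reach the desired accuracy in fewer steps.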

This shows by an explicit error formula that the sequence x_n converges. We define a real
function g[y] = lim_{n→∞} x_n whenever |y - y_0| < Δ. We can approximate g[y] = x_∞ using
the recursive sequence. (This is a variant of “Newton’s method” for uniform derivatives.)

Moreover, if |y - y_0| < Δ, then f[x_∞] is defined and |f'[x_∞] - f'[x_0]| < |m|/2 because
|x_∞ - x_0| ≤ 2|x_1 - x_0| = 2|y - y_0|/|m| < 3Δ/|m|.

Consider the limit

$$x_\infty = \lim_{n\to\infty} x_n$$
$$= \lim_{n\to\infty} G[x_{n-1}]$$
$$= G[\lim_{n\to\infty} x_{n-1}]$$
$$= G[x_\infty]$$
$$x_\infty = x_\infty + (y - f[x_\infty])/m$$
$$0 = (y - f[x_\infty])/m$$
$$y = f[x_\infty]$$
so g[y] = x_∞ is the value of the inverse function and proves that the inverse function exists
in our real interval [x_0 - ε, x_0 + ε].

We conclude by showing that g[y] is differentiable. Take y_1 ≈ y_2 in the interval
(y_0 - Δ, y_0 + Δ), not near the endpoints. We know that f'[x_i] is defined and |f'[x_i]| > |m|/2 where
x_i = g[y_i] for i = 1, 2.

We have x_2 ≈ x_1 because the defining sequences stay close. Let x_{i1} = x_0 + (y_i - f[x_0])/m
and x_{i(n+1)} = x_{in} + (y_i - f[x_{in}])/m, for i = 1, 2. Then |x_{21} - x_{11}| = |y_2 - y_1|/|m| ≈ 0
and since |x_i - x_0| < 3Δ/|m|, f[x] is differentiable at x_{in}. We can show inductively that
|x_{2(n+1)} - x_{1(n+1)}| < B·|y_2 - y_1| for some finite multiplier B when n is finite, because

$$x_{2(n+1)} - x_{1(n+1)} = (x_{2n} - x_{1n}) + \frac{y_2 - y_1}{m} - \frac{f[x_{2n}] - f[x_{1n}]}{m}$$
$$= (x_{2n} - x_{1n}) + \frac{y_2 - y_1}{m} - \frac{(f'[x_{1n}] + \iota)\cdot(x_{2n} - x_{1n})}{m}$$
$$= \frac{y_2 - y_1}{m} + \left(1 - \frac{f'[x_{1n}] + \iota}{m}\right)\cdot(x_{2n} - x_{1n})$$

For any real positive θ, choose k large enough so that |x_i - x_{ik}| ≤ (2/2^k)|x_{i1} - x_0| < θ/3. Then
|x_2 - x_1| ≤ |x_2 - x_{2k}| + |x_1 - x_{1k}| + |x_{2k} - x_{1k}| < θ. Since θ is arbitrary, we must have
x_2 ≈ x_1.

Differentiability of f[x] means that

$$f[x_2] - f[x_1] = f'[x_1](x_2 - x_1) + \iota\cdot(x_2 - x_1)$$

with ι ≈ 0. Solving for (x_2 - x_1) gives

$$(x_2 - x_1) = \frac{y_2 - y_1}{f'[x_1] + \iota}$$

Since |f'[x_1]| > |m|/2, 1/(f'[x_1] + ι) = 1/f'[x_1] + η with η ≈ 0. Hence,

$$g[y_2] - g[y_1] = \frac{1}{f'[x_1]}\,(y_2 - y_1) + \eta\cdot(y_2 - y_1)$$

and g[y] is differentiable with g'[y_1] = 1/f'[x_1].
Pointwise Derivatives



This chapter explores the pathological consequences of weakening 
the definition of the derivative to only requiring the limit of differ- 
ence quotients to converge pointwise. 



Could a function have a derivative of 1 at x = a and not be increasing near x = a?
Could we have F'[x] = f[x] on [a,b] and yet not have $\int_a^b f[x]\,dx = F[b] - F[a]$? The
strange answer is “yes” if you weaken the definition of derivative to allow only “pointwise” 
derivatives. We chose Definition 5.2 because ordinary functions given by formulas do not 
exhibit these pathologies, as shown in Theorem 5.5. We make the theory simple and natural 
with Definition 5.2 and lose nothing more than strange curiosities. If f[x] is smooth on an 
interval around x = a and f'[a] = 1, then f[x] is increasing. (See Theorem 5.6.) The 
Fundamental Theorem of Integral Calculus above holds for smooth functions, as our proof 
shows, whereas a pointwise derivative need not even be integrable. 

It is an unfortunate custom in many calculus texts to use the pointwise derivative. (As Pe- 
ter Lax said in his lecture at a conference in San Jose, ‘No self-respecting analyst would study 
the class of only pointwise differentiable functions.’) This chapter explores the pathologies 
of the pointwise derivative and concludes with the connection with Definition 5.2 in Theo- 
rem 7.4: If pointwise derivatives are continuous, they satisfy Definition 5.2. The contrast 
of the straightforward proofs by approximation in this book with the round-about proofs of 
things like the Fundamental Theorem in many “traditional” books is then clear. The Mean 
Value Theorem 7.1 is used to make an approximation uniform. The traditional approach 
obscures the approximation concepts, makes the Mean Value Theorem seem more central 
than it actually is, and contributes no interesting new theorems other than the Mean Value 
Theorem whose main role is to recover from the over-generalization using Theorem 7.4. 



6.1 Pointwise Limits 



This section reviews the idea of a limit both from the point of view of “ep- 
silons and deltas” and infinitesimals. 



Suppose g[Δx] is a function that only depends on one real variable, Δx, and is defined
when 0 < |Δx| < b. (The function may or may not be defined when Δx = 0.) Let g_0 be a
number. The intuitive meaning of

$$\lim_{\Delta x\to 0} g[\Delta x] = g_0$$

is that g[δx] is close to the value g_0, when δx is small or close to zero, but not necessarily equal
to zero. (We exclude δx = 0, because we often have an expression which is not defined at the
limiting value, such as g[Δx] = Sin[Δx]/Δx and want to know that lim_{Δx→0} Sin[Δx]/Δx =
1.) Technically, the limit is g_0 if the natural extension function satisfies g[δx] ≈ g_0, whenever
0 ≠ δx ≈ 0.

We proved the following in Theorem 3.2. 

Theorem 6.1. Let g[Δx] be a function of one real variable defined for 0 < |Δx| < b
and let g_0 be a number. Then the following are equivalent definitions of

$$\lim_{\Delta x\to 0} g[\Delta x] = g_0$$

(a) For every nonzero infinitesimal δx, the natural extension function satisfies

$$g[\delta x] \approx g_0$$

(b) For every real positive error tolerance, ε, there is a corresponding input
allowance D[ε], such that whenever the real value satisfies 0 < |Δx| < D[ε],
then

$$|g[\Delta x] - g_0| < \varepsilon$$



Example 6.1. lim_{Δx→0} Sin[π/Δx] does not exist



We want to see why a limit need not exist in the case

\[ g[\Delta x] = \operatorname{Sin}\left[\frac{\pi}{\Delta x}\right] \]

Notice that g[Δx] is defined for all real Δx except Δx = 0. The fact that it is not defined
at Δx = 0 is not the reason that there is no limit. We will show below that

\[ \lim_{\Delta x \to 0} \Delta x \operatorname{Sin}\left[\frac{\pi}{\Delta x}\right] = 0 \]

even though this second function is also undefined at Δx = 0.

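We know from the 2π periodicity of sine that

\[ \operatorname{Sin}[\theta] = +1 \ \text{if}\ \theta = 2k\pi + \frac{\pi}{2} \quad\text{and}\quad \operatorname{Sin}[\theta] = -1 \ \text{if}\ \theta = 2k\pi - \frac{\pi}{2} \]

whenever k = ±1, ±2, ... is an integer. Hence we see that

\[ g[\Delta x_1] = \operatorname{Sin}\left[\frac{\pi}{\Delta x_1}\right] = +1 \quad\text{if}\quad \Delta x_1 = \frac{1}{2k + \frac{1}{2}} \]

\[ g[\Delta x_2] = \operatorname{Sin}\left[\frac{\pi}{\Delta x_2}\right] = -1 \quad\text{if}\quad \Delta x_2 = \frac{1}{2k - \frac{1}{2}} \]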

Intuitive Reason for No Limit

We can take k as large as we please, making Δx_1 and Δx_2 both close to zero, yet
g[Δx_1] = +1 and g[Δx_2] = −1, so there is no single limiting value g_0.
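The intuitive argument can be checked numerically. The following is a minimal sketch in Python (not the book's Mathematica); the function names `g` and `peak_and_trough` are hypothetical helpers introduced only for this illustration. It evaluates Sin[π/Δx] at the two sequences Δx_1 = 1/(2k + 1/2) and Δx_2 = 1/(2k − 1/2).

```python
import math

def g(dx):
    """g[dx] = Sin[pi/dx], defined for dx != 0."""
    return math.sin(math.pi / dx)

def peak_and_trough(k):
    """Return (dx1, dx2) = (1/(2k + 1/2), 1/(2k - 1/2)) for a positive integer k."""
    return 1.0 / (2 * k + 0.5), 1.0 / (2 * k - 0.5)

for k in (1, 10, 1000, 100000):
    dx1, dx2 = peak_and_trough(k)
    # Both inputs shrink toward zero while the outputs stay near +1 and -1.
    print(f"k={k:>6}  dx1={dx1:.3e}  g={g(dx1):+.6f}   dx2={dx2:.3e}  g={g(dx2):+.6f}")
```

Floating-point arithmetic only approximates the exact values ±1, so this is evidence for the argument, not a replacement for it.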
Figure 6.1: y = Sin[π/x]

Rigorous Infinitesimal Reason for No Limit

We want an infinite integer K, so we can let δx_1 = 1/(2K + 1/2) and δx_2 = 1/(2K − 1/2)
and have δx_1 ≈ δx_2 ≈ 0, g[δx_1] = +1 and g[δx_2] = −1. We can rigorously define infinite
integers by function extension.

Let N[x] be the function that indicates whether or not a number is an integer,

\[ N[x] = \begin{cases} 0, & \text{if } x \text{ is not an integer} \\ 1, & \text{if } x \text{ is an integer} \end{cases} \]

Then we know that if N[k] = 1, then Sin[(2k ± 1/2)π] = ±1 (respectively). We also know
that every real number is within 1/2 unit of an integer; for every x, there is a k with |x − k| ≤ 1/2
and N[k] = 1. The natural extension of N[x] also has these properties, so given any infinite
hyperreal H, there is another K satisfying |H − K| ≤ 1/2, N[K] = 1 and Sin[(2K ± 1/2)π] = ±1.
This technically completes the intuitive argument above, since we have two infinitesimals

\[ \delta x_1 = \frac{1}{2K + \frac{1}{2}} \quad\text{and}\quad \delta x_2 = \frac{1}{2K - \frac{1}{2}} \]

with g[δx_1] a distance of 2 units from g[δx_2].

Rigorous ε - D Reason for No Limit

No limit means that the negation of the ε - D statement holds for every real value
g_0. Negation of quantifiers is tricky, but the correct negation is that for every g_0, there is
some real ε > 0 so that for every real D > 0 there is a real Δx such that 0 < |Δx| < D and
|g[Δx] − g_0| ≥ ε.

Let ε = 1/3 and let g_0 and D > 0 be arbitrary real numbers. We know that we can take
an integer k large enough so that 0 < |Δx_1| < |Δx_2| < D, g[Δx_1] = +1 and g[Δx_2] = −1.
At most one of the two values can be within 1/3 of g_0, because if |g[Δx_1] − g_0| < 1/3 then
|g[Δx_2] − g_0| ≥ 1/3, or vice versa. This shows that the negation of the ε - D statement holds.

Example 6.2. lim_{Δx→0} Δx Sin[π/Δx] = 0

Since |Sin[θ]| ≤ 1, |Δx Sin[anything]| ≤ |Δx|, so if |Δx| is small, |Δx Sin[π/Δx]| is small.
The rigorous justification with infinitesimals is obvious and the rigorous ε - D argument
follows simply by taking D[ε] = ε.
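The choice D[ε] = ε can be spot-checked by sampling. This is a minimal sketch in Python, assuming random sampling is an acceptable stand-in for "every real Δx"; `h`, `allowance`, and `check_tolerance` are hypothetical names, and the check is evidence only, not a proof.

```python
import math
import random

def h(dx):
    """h[dx] = dx * Sin[pi/dx], defined for dx != 0."""
    return dx * math.sin(math.pi / dx)

def allowance(eps):
    """The input allowance D[eps] = eps suggested in Example 6.2."""
    return eps

def check_tolerance(eps, samples=10000, seed=0):
    """Sample dx with 0 < |dx| < D[eps]; report whether |h[dx] - 0| < eps always held."""
    rng = random.Random(seed)
    d = allowance(eps)
    for _ in range(samples):
        dx = rng.uniform(-d, d)
        if dx == 0.0 or abs(dx) >= d:
            continue  # outside the punctured interval 0 < |dx| < D
        if not abs(h(dx)) < eps:
            return False
    return True

print(check_tolerance(0.1), check_tolerance(1e-6))
```

The same test applied to Sin[π/Δx] alone would fail for any tolerance below 1, which is the content of Example 6.1.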
Exercise set 6.1

1. Show that lim_{Δx→0} Δx² Sin[π/Δx] = 0.

2. Show that liniAx^o ^Cos[7r/Aa;^] does not exist. 



6.2 Pointwise Derivatives 



What happens if we apply the pointwise limit idea to g[Δx] = (f[x + Δx] −
f[x])/Δx, “holding x fixed”? In fact, many books use this to define the
following weak notion of derivative.



Definition 6.2. Pointwise Derivative

We say that the function f[x] has pointwise derivative D_x f[x_0] at a point x_0 if

\[ \lim_{\Delta x \to 0} \frac{f[x_0 + \Delta x] - f[x_0]}{\Delta x} = D_x f[x_0] \]



What is the difference between this definition and Definition 5.2? We can explain this
either in terms of the ε - D definition, or in terms of infinitesimals. In terms of ε - D limits,
the input allowance D[ε] can depend on the point x_0 in the pointwise definition. In the
following example, f[x] = x² Sin[π/x], a D[ε] that works at x = 0 does not work at x = √ε.

In terms of infinitesimals, the increment approximation

\[ f[x_0 + \delta x] - f[x_0] = f'[x_0] \cdot \delta x + \varepsilon \cdot \delta x \]

only holds at fixed real values. In the following example, this approximation fails at hyperreal
values like x = 1/(2K).

Before we proceed with the example, we repeat an important observation of Theorem 5.5. 
Derivatives computed by rules automatically satisfy the stronger approximation, provided 
the formulas are valid on intervals. If you compute derivatives by rules, you know that you 
will see straight lines in microscopic views of the graph. The next example shows that this 
is false for the weaker approximation of pointwise derivatives. 

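Example 6.3. A Non-Smooth, Pointwise Differentiable Function

From the exercise above, the following definition makes f[x] a continuous function.

\[ f[x] = \begin{cases} 0, & \text{if } x = 0 \\ x^2 \operatorname{Sin}\left[\frac{\pi}{x}\right], & \text{if } x \neq 0 \end{cases} \]

That is, lim_{x→0} f[x] = f[0] = 0. Differentiation of f[x] by rules does not apply at x = 0,
since we obtain

\[ f'[x] = 2x \operatorname{Sin}\left[\frac{\pi}{x}\right] - \pi \operatorname{Cos}\left[\frac{\pi}{x}\right] \]

which is undefined at x = 0.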

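We can apply the pointwise definition of derivative to this function at x_0 = 0,

\[ \lim_{\Delta x \to 0} \frac{f[x_0 + \Delta x] - f[x_0]}{\Delta x} = \lim_{\Delta x \to 0} \frac{f[0 + \Delta x] - f[0]}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta x^2 \operatorname{Sin}\left[\frac{\pi}{\Delta x}\right] - 0}{\Delta x} = \lim_{\Delta x \to 0} \Delta x \operatorname{Sin}\left[\frac{\pi}{\Delta x}\right] = 0 \]

So the pointwise derivative at x = 0 is D_x f[0] = 0.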
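The contrast between the pointwise derivative at zero and the rule formula away from zero can be seen numerically. This is a minimal sketch in Python; the function names are hypothetical, and the sample points x = 1/(2k) and x = 1/(2k + 1) are chosen here only because the cosine is +1 and −1 there.

```python
import math

def f(x):
    """f[x] = x^2 Sin[pi/x] for x != 0, and f[0] = 0."""
    return 0.0 if x == 0 else x * x * math.sin(math.pi / x)

def difference_quotient_at_zero(dx):
    """(f[0 + dx] - f[0]) / dx, which simplifies to dx * Sin[pi/dx]."""
    return (f(dx) - f(0.0)) / dx

def rule_derivative(x):
    """2 x Sin[pi/x] - pi Cos[pi/x], the formula from the rules, valid only for x != 0."""
    return 2 * x * math.sin(math.pi / x) - math.pi * math.cos(math.pi / x)

for n in range(1, 6):
    dx = 10.0 ** (-n)
    print(f"dx={dx:.0e}  quotient at 0 = {difference_quotient_at_zero(dx):+.3e}")

for k in (10, 1000, 100000):
    # At x = 1/(2k) the cosine is +1; at x = 1/(2k+1) it is -1.
    print(k, rule_derivative(1 / (2 * k)), rule_derivative(1 / (2 * k + 1)))
```

The quotient at zero is bounded by |Δx|, while the rule formula keeps swinging between about −π and +π at points arbitrarily close to zero, so D_x f[x] is not continuous at x = 0.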

If we focus a microscope at (x, y) = (0, 0) and magnify enough, we will see only a
straight line.

Figure 6.2: Three Views of y = x² Sin[π/x] (panels a, b, c)

However, if we first magnify this much and then move the focus point to

\[ x = 1/\sqrt{\text{magnification}} \]

we will no longer see a straight line, but rather a pure sinusoid!
Example 6.4. Why we see a sinusoid

Suppose we focus our microscope at the point x = 1/(2K), for a very large K. We know
that Sin[π/x] = ±1 for x = 1/(2K ± 1/2). This makes the function values x² Sin[π/x] differ by

\[ \frac{1}{(2K + \frac{1}{2})^2} - \frac{-1}{(2K - \frac{1}{2})^2} \approx \frac{2}{(2K)^2} \]

so that we see 2 after magnification by (2K)²,

\[ \frac{1}{(1 + \frac{1}{4K})^2} + \frac{1}{(1 - \frac{1}{4K})^2} \approx 2 \]

We compute the distance between these x points,

\[ \frac{1}{2K - \frac{1}{2}} - \frac{1}{2K + \frac{1}{2}} = \frac{2K + \frac{1}{2} - 2K + \frac{1}{2}}{(2K - \frac{1}{2})(2K + \frac{1}{2})} = \frac{1}{4K^2 - \frac{1}{4}} = \frac{1}{(2K)^2} \cdot \frac{1}{1 - \frac{1}{16K^2}} \]

These points are one unit apart on the 1/(2K)² scale,

\[ \frac{1}{1 - \frac{1}{16K^2}} \approx 1 \]

We will see a difference of two units in function values at magnification (2K)² and the
differing points lie one unit apart at this magnification.

We can say more. If we magnify by 4K² and observe the function f[x + δx] with the
microscope centered at (x, 0) = (1/(2K), 0), we see the magnified values

\[ 4K^2 (x + \delta x)^2 \operatorname{Sin}\left[\frac{\pi}{x + \delta x}\right], \quad x \text{ fixed}, \ \delta x \text{ varying.} \]

But we also see magnified values on the dx axis. Let δx = dx/(4K²), for dx finite and let

\[ F[dx] = 4K^2 (x + \delta x)^2 \operatorname{Sin}\left[\frac{\pi}{x + \delta x}\right] \]

with this relationship between the true δx and the dx we see in the microscope. Our
microscopic view is the same as F[dx] at unit scale. The coefficient in front of the sine
above is actually constant on the scale of our microscope.

\[ 4K^2 (x + \delta x)^2 = 4K^2 \left(\frac{1}{2K} + \frac{dx}{4K^2}\right)^2 = \left(\frac{2K}{2K} + \frac{dx \, 2K}{4K^2}\right)^2 = \left(1 + \frac{dx}{2K}\right)^2 \approx 1 \]

for dx finite, so F[dx] ≈ Sin[π/(x + δx)] on this scale. By algebra (as in Chapters 5 and 28 of the
main text)

\[ \frac{1}{x + \delta x} = \frac{1}{x} - \frac{\delta x}{x (x + \delta x)} = 2K - \frac{\frac{dx}{(2K)^2}}{\frac{1}{2K}\left(\frac{1}{2K} + \frac{dx}{(2K)^2}\right)} = 2K - \frac{dx}{1 + \frac{dx}{2K}} \approx 2K - dx \]

This means that Sin[π/(x + δx)] ≈ Sin[2Kπ − π dx] = −Sin[π dx]. At the point x = 1/(2K)
with magnification (2K)², we see the function

dy = −Sin[π dx]
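The microscope computation above can be imitated with a large finite K. This is a minimal sketch in Python, assuming K = 10000 is "large enough" for illustration; `microscope_view` is a hypothetical helper that magnifies both axes by (2K)² around the point (1/(2K), 0).

```python
import math

def f(x):
    """f[x] = x^2 Sin[pi/x] for x != 0."""
    return x * x * math.sin(math.pi / x)

def microscope_view(K, dx):
    """Magnified value (2K)^2 * f[x + dx/(2K)^2] with x = 1/(2K), where f[x] is (about) 0."""
    x = 1.0 / (2 * K)
    mag = (2.0 * K) ** 2
    return mag * f(x + dx / mag)

K = 10000
for dx in (-1.0, -0.5, 0.25, 0.5, 1.0, 1.5):
    view = microscope_view(K, dx)
    # Compare the magnified graph with the predicted sinusoid dy = -Sin[pi dx].
    print(f"dx={dx:+.2f}  view={view:+.5f}  -sin(pi dx)={-math.sin(math.pi * dx):+.5f}")
```

With a finite K the two columns agree only approximately; the discrepancy shrinks roughly like 1/K, which is the finite shadow of the infinitesimal statement in the text.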



Figure 6.3: y = f[x] at x = 1/(2K) (axes dx and dy)




Example 6.5. More Trouble with Pointwise Derivatives

The sinusoidal view we see in the microscope is just a hint of what can go wrong with
derivatives that are only given by pointwise limits. A pointwise derivative can be 1 and yet
the function need not be increasing near the point. The Fundamental Theorem of Integral
Calculus is false if we only assume that D_x F[x] = f[x], because then ∫_a^b f[x] dx need not
exist. The section below on the Mean Value Theorem unravels the mystery. A pointwise
derivative D_x f[x] is a continuous function on an interval if and only if it is actually an
ordinary derivative, D_x f[x] = f'[x].



Exercise set 6.2

1. Show that lim_{x→0} (2x Sin[π/x] − π Cos[π/x]) does not exist.

2. Use Mathematica to Plot y = x² Sin[π/x] from -0.0001 to +0.0001. Use AspectRatio -> 1
and PlotRange -> {-0.0001, 0.0001}. You should see a straight line, but if you do not
control the PlotRange, you won’t. (Try the plot without setting equal scales.)

Now move the focus point of your microscope to x = 0.01. Plot from 0.0099 to 0.0101
with PlotRange -> {-0.0001, 0.0001}. You will see a sinusoid. (If you use equal scales.)

3. Show that the function f[x] = x Sin[π/x] is continuous if we extend its definition to
f[0] = 0. Show that the extended function does not even have a pointwise derivative at
x = 0. What do you see if you Plot this function at a very small scale at zero?

4. Show that the function f[x] = x² Sin[π/x] is continuous if we extend its definition to
f[0] = 0. Show that the extended function has a pointwise derivative at x = 0. What
do you see if you Plot this function at a very small scale near zero? Do wiggles appear?




6.3 Pointwise Derivatives Aren’t Enough for Inverses



A function can have a pointwise derivative at every point, with D_x f[x_0] = 1, but
not be increasing in any neighborhood of x_0.



The function

\[ w[x] = \begin{cases} 0, & \text{if } x = 0 \\ x + x^2 \operatorname{Sin}\left[\frac{\pi}{x}\right], & \text{if } x \neq 0 \end{cases} \]

has pointwise derivative D_x w[0] = 1. However, this function does not have an inverse in
any neighborhood of zero. It is NOT increasing in any neighborhood of zero. You can verify
this yourself. Here are plots of w[x] on two scales:






Figure: y = w[x] = x + x² Sin[π/x] for −1 < x < 1 and for 0.0099 < x < 0.0101
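The failure to increase can also be observed numerically. This is a minimal sketch in Python; `decreasing_pair` is a hypothetical helper, and the choice of points straddling x = 1/(2k) is an assumption made here because the rule derivative 1 + 2x Sin[π/x] − π Cos[π/x] is about 1 − π < 0 at those points.

```python
import math

def w(x):
    """w[x] = x + x^2 Sin[pi/x] for x != 0, and w[0] = 0."""
    return 0.0 if x == 0 else x + x * x * math.sin(math.pi / x)

def decreasing_pair(k, fraction=0.01):
    """Return x1 < x2 straddling x0 = 1/(2k), where the graph of w slopes downward."""
    x0 = 1.0 / (2 * k)
    step = fraction * x0 * x0  # small compared with the wiggle width, which is about x0^2
    return x0 - step, x0 + step

for k in (5, 50, 500, 5000):
    x1, x2 = decreasing_pair(k)
    # A positive last column means w[x1] > w[x2] even though x1 < x2.
    print(f"k={k:>4}  x1={x1:.6e} < x2={x2:.6e}  w[x1]-w[x2]={w(x1) - w(x2):+.3e}")
```

Since k can be as large as we like, such pairs appear arbitrarily close to zero; the numbers illustrate the claim but do not prove it.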




Exercise set 6.3

1. (a) Show that the function w[x] above has an ordinary derivative at every x ≠ 0.

(b) Show that the function above has a pointwise derivative at every x, in particular,
the pointwise derivative D_x w[0] = 1. (HINT: Write the definition and estimate.)

(c) Verify the plots shown using Mathematica with equal scales. One plot is from
-1 to 1 and the other is from 0.0099 to 0.0101, both with AspectRatio -> 1 and
PlotRanges equal to x ranges.

(d) Prove that for every real θ > 0, there are numbers x_1 < x_2 with |x_1| < θ, |x_2| < θ
and w[x_1] > w[x_2], as shown on the decreasing portion of the small scale graph above.
CHAPTER 7

The Mean Value Theorem

The following situation illustrates the main result of this chapter. 




You travel a total distance of 100 miles in an elapsed time of 2 hours for an average or 
“mean” speed of 50 mph. However, your speed varies. For example, you start from rest, 
drive through city streets, stop at stop signs, then enter the Interstate and travel most of 
the way at 65 mph. Were you ever going exactly 50 mph? Of course, but how can we show 
this mathematically? 



Exercise set 7.0

1. Sketch the graph of a trip beginning at 2 pm, 35 miles from a reference point, ending at
4 pm 135 miles from the point and having the features of stopping at stop signs, etc.,
as described above.

Sketch the line connecting the end points of the graph (a, f[a]) and (b, f[b]). What is
the slope of this line?

Find a point on your sketch where the speed is 50 mph and sketch the tangent line at
that point. Call the point c. Why does this satisfy f'[c] = (f[b] − f[a])/(b − a)?



7.1 The Mean Value Theorem 



The Mean Value Theorem asserts that there is a place where the value of 
the instantaneous speed equals the average speed. This theorem is true even 
if the derivative is only defined pointwise. 



We want to formulate the speed problem above in a general way for a function y = f[x]
on an interval [a, b]. You may think of x as the time variable with x = a at the start of the
trip and x = b at the end. The elapsed time traveled is b − a, or 2 hours in the example.
(Perhaps you start at 2 and end at 4, 4 − 2 = 2.) You may think of y = f[x] as a distance
from a reference point, so we start at f[a], end at f[b] and travel a total of f[b] − f[a]. The
average speed is (f[b] − f[a])/(b − a).

We state the Mean Value Theorem in its ultimate generality, only assuming weakly
approximating pointwise derivatives and those only at interior points. This complicates the
proof, but will be the key to seeing why regular derivatives and pointwise derivatives are
the same when the pointwise derivative is continuous.



Theorem 7.1. The Mean Value for Pointwise Derivatives

Let f[x] be a function which is pointwise differentiable at every point of the open
interval (a, b) and is continuous on the closed interval [a, b]. There is a point c in
the open interval a < c < b such that

\[ D_x f[c] = \frac{f[b] - f[a]}{b - a} \]




Figure 7.1: Mean Slope and Tangents 



There may be more than one point where Dxf[c] equals the mean speed or slope. 
Proof 

The average speed over a sub-interval of length Δx is

\[ g[x] = \frac{f[x + \Delta x] - f[x]}{\Delta x} \]

and this new function is defined and continuous on [a, b − Δx].

Suppose we let Δx_1 = (b − a)/3 and compute the average of 3 averages, the speeds on
[a, a + Δx_1], [a + Δx_1, a + 2Δx_1] and [a + 2Δx_1, a + 3Δx_1]. This ought to be the same as
the overall average and the telescoping sum below shows that it is:

\[ \frac{1}{3}\left(g[a] + g[a + \Delta x_1] + g[a + 2\Delta x_1]\right) \]
\[ = \frac{1}{3}\left(\frac{f[a + \Delta x_1] - f[a]}{\Delta x_1} + \frac{f[a + 2\Delta x_1] - f[a + \Delta x_1]}{\Delta x_1} + \frac{f[a + 3\Delta x_1] - f[a + 2\Delta x_1]}{\Delta x_1}\right) \]
\[ = \frac{f[a + \Delta x_1] - f[a] + f[a + 2\Delta x_1] - f[a + \Delta x_1] + f[a + 3\Delta x_1] - f[a + 2\Delta x_1]}{3 \Delta x_1} \]
\[ = \frac{-f[a] + f[b]}{3 \, \frac{b - a}{3}} = \frac{f[b] - f[a]}{b - a} \]

This implies that there is an adjacent pair of sub-intervals with

\[ \frac{f[x_{lo} + \Delta x_1] - f[x_{lo}]}{\Delta x_1} \le \frac{f[b] - f[a]}{b - a} \le \frac{f[x_{hi} + \Delta x_1] - f[x_{hi}]}{\Delta x_1} \]

because the average of the three sub-interval speeds equals the overall average and so either
all three also equal the overall average, or one is below and another is above the mean slope.
(We know that x_lo and x_hi differ by Δx_1, but we do not care in which order they occur,
x_lo < x_hi or x_hi < x_lo.)

Since g[x] is continuous, Bolzano’s Intermediate Value Theorem 4.5 says that there is an
x_1 between x_lo and x_hi with g[x_1] = (f[x_1 + Δx_1] − f[x_1])/Δx_1 = (f[b] − f[a])/(b − a). The
subinterval [x_1, x_1 + Δx_1] lies inside (a, b), has length (b − a)/3 and f[x] has the same mean
slope over the subinterval as over the whole interval. (So far we have only used continuity
of f[x].)

Let Δx_2 = (b − a)/3², one third of the length of [x_1, x_1 + Δx_1]. We can repeat the
average of averages procedure above on the interval [x_1, x_1 + Δx_1] and obtain a new
sub-interval [x_2, x_2 + Δx_2] inside the old sub-interval such that (f[x_2 + Δx_2] − f[x_2])/Δx_2 =
(f[b] − f[a])/(b − a).

Continuing recursively, we can find x_n in (x_{n−1}, x_{n−1} + Δx_{n−1}) with Δx_n = (b − a)/3^n
and (f[x_n + Δx_n] − f[x_n])/Δx_n = (f[b] − f[a])/(b − a).

The sequence of numbers a_n = x_n increases to a limit c in (a, b), and the sequence
b_n = x_n + Δx_n decreases to c. In addition, we have

\[ \frac{f[b] - f[a]}{b - a} = \frac{f[x_n + \Delta x_n] - f[x_n]}{\Delta x_n} = \frac{f[b_n] - f[a_n]}{b_n - a_n} = \frac{f[b_n] - f[c] + f[c] - f[a_n]}{b_n - a_n} \]
\[ = \frac{b_n - c}{b_n - a_n} \cdot \frac{f[b_n] - f[c]}{b_n - c} + \frac{c - a_n}{b_n - a_n} \cdot \frac{f[c] - f[a_n]}{c - a_n} \]

Notice that coefficients are positive and satisfy

\[ \frac{b_n - c}{b_n - a_n} + \frac{c - a_n}{b_n - a_n} = 1 \]

Also notice that

\[ \lim_{n \to \infty} \frac{f[b_n] - f[c]}{b_n - c} = \lim_{n \to \infty} \frac{f[c] - f[a_n]}{c - a_n} = D_x f[c] \]

Hence

\[ \frac{f[b] - f[a]}{b - a} = \lim_{n \to \infty} \left(\frac{b_n - c}{b_n - a_n} \cdot \frac{f[b_n] - f[c]}{b_n - c} + \frac{c - a_n}{b_n - a_n} \cdot \frac{f[c] - f[a_n]}{c - a_n}\right) = D_x f[c] \]

and we have proved the general result of the graphically ‘obvious’ Mean Value Theorem,
by finding a sequence of shorter and shorter sub-intervals with the same mean slope and
‘taking the limit.’
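The proof is constructive enough to run. The following is a minimal sketch in Python of the "average of averages" procedure, under stated assumptions: the Intermediate Value Theorem step is replaced by ordinary bisection, the recursion stops after a fixed number of levels, and the names `bisect_root` and `mean_value_point` are hypothetical.

```python
def bisect_root(g, lo, hi, steps=80):
    """Bisection for a root of a continuous g with g(lo), g(hi) of opposite sign (or zero)."""
    glo = g(lo)
    if glo == 0.0:
        return lo
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        gmid = g(mid)
        if gmid == 0.0:
            return mid
        if (gmid > 0) == (glo > 0):
            lo, glo = mid, gmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_value_point(f, a, b, levels=12):
    """Follow the proof: shrink [left, left + width] by thirds, keeping the mean slope m."""
    m = (f(b) - f(a)) / (b - a)
    left, width = a, b - a
    for _ in range(levels):
        dx = width / 3.0
        g = lambda x, dx=dx: (f(x + dx) - f(x)) / dx - m  # sub-interval slope minus mean slope
        pts = [left, left + dx, left + 2 * dx]
        vals = [g(p) for p in pts]
        # The three values average to (about) zero, so some adjacent pair brackets zero.
        for i in (0, 1):
            if vals[i] * vals[i + 1] <= 0:
                left = bisect_root(g, pts[i], pts[i + 1])
                break
        else:
            # Rounding can hide the sign change; fall back to the value nearest zero.
            left = pts[min(range(3), key=lambda j: abs(vals[j]))]
        width = dx
    return left + 0.5 * width

c = mean_value_point(lambda x: x ** 3, 0.0, 1.0)
print(c, 3 * c ** 2)  # 3 c^2 should be close to the mean slope 1
```

For f[x] = x³ on [0, 1] the mean slope is 1, and the nested sub-intervals close in on the point where 3c² = 1. The sketch only locates c approximately; the limit argument in the proof is what guarantees it exists.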



Exercise set 7.1

1. Suppose f[x] satisfies Definition 5.2. Show that the step in the proof of the Mean Value
Theorem where we write

\[ \frac{b_n - c}{b_n - a_n} \cdot \frac{f[b_n] - f[c]}{b_n - c} + \frac{c - a_n}{b_n - a_n} \cdot \frac{f[c] - f[a_n]}{c - a_n} \]

can be skipped. If we take an infinite n, we must automatically have

\[ \frac{f[b_n] - f[a_n]}{b_n - a_n} \approx f'[a_n] \approx f'[b_n] \approx f'[c] \]

when f[x] satisfies Definition 5.2. Why?

2. This exercise seeks to explain why we call the fraction

\[ \frac{f[b] - f[a]}{b - a} \]

the average speed in the case of the ordinary derivative, Definition 5.2.
The average of a continuous function g[x] over the interval [a, b] is

\[ \frac{1}{b - a} \int_a^b g[x] \, dx \]

If f[x] satisfies Definition 5.2, show that the average of the speed is

\[ \frac{1}{b - a} \int_a^b f'[x] \, dx = \frac{f[b] - f[a]}{b - a} \]

What theorem do you use to make this general calculation? Why do you need
Definition 5.2 rather than only a pointwise derivative?

Write an approximating sum for the integral and substitute the microscope approximation
f'[x] δx = f[x + δx] − f[x] − ε δx as the summand. The latter sum telescopes to
f[b] − f[a] with your adjusting constants.

Write the average of small interval speeds, (f[x + δx] − f[x])/δx, for enough terms to
move from a to b. How many terms are there in the sum? Why is this sum

\[ \frac{1}{(b - a)/\delta x} \sum_{x = a,\ \text{step } \delta x}^{b - \delta x} \frac{f[x + \delta x] - f[x]}{\delta x} \]

approximately the integral above?
3. Alternate Proof to Averaging Averages

Let f[x] satisfy the hypotheses of the Mean Value Theorem for Pointwise Derivatives.
Let the constant m denote the mean slope,

\[ m = \frac{f[b] - f[a]}{b - a} \]

Define a function

h[x] = f[x] − m · (x − a)

Show that h[x] has the following properties:

(a) h[x] is continuous on [a, b].

(b) h[x] is pointwise differentiable on (a, b).

(c) h[a] = f[a]

(d) h[b] = f[a] = h[a], so that the mean slope of h[x] is zero,

\[ \frac{h[b] - h[a]}{b - a} = 0 \]

(e) For any x, we have D_x h[x] = 0 ⇔ D_x f[x] = m

The function h[x] has a horizontal mean chord. We want you to show that there
is a point c in (a, b) where D_x h[c] = 0.

(f) Show that h[x] satisfies the hypotheses of the Extreme Value Theorem 4.4 on
[a, b], hence must have both a max and a min.

(g) Show that either h[x] is constant or not both the max and min occur at endpoints.
In other words, there is a c in the open interval (a, b) where h[c] is either a max
or a min for [a, b].

(h) Prove a pointwise version of the Interior Critical Points Theorem 10.2 from
the main text and show that D_x h[c] = 0.

(i) Show that D_x f[c] = (f[b] − f[a])/(b − a).



7.2 Darboux’s Theorem 



Suppose that f[x] is pointwise differentiable, but D_x f[x] is not necessarily
continuous. The derivative function still has the intermediate value property.
In other words, a derivative cannot be defined and take a jump in values.
(Pointwise derivatives can oscillate to a discontinuity, be defined, and
NOT be continuous. Ordinary derivatives are continuous by Theorem 5.4.)



How do we know that it is sufficient to just check one point between the zeros of f'[x] 
in the graphing procedure of the main text Chapter 9? If f'[x] is not zero in an interval 
a < X < b and if f'[x] cannot change sign without being zero, then the sign of any one point 
determines the sign of all the others in the interval. Derivatives have the property that they 
cannot change sign without being zero, but not every function has this property. 



It was 5°C when I woke up this morning, but has warmed up to a comfortable 16° 
now (61° F). Was it ever 10° this morning? Most people would say, ‘Yes.’ They implicitly 
reason that temperature ‘moves continuously’ through values and hence hits all intermediate 
values. This idea is a precise mathematical theorem and its most difficult part is in correctly 
formulating what we mean by ‘continuous’ function. See Theorem 4.5. 

Darboux’s Theorem even holds for the discontinuous weak pointwise derivatives defined 
above. We showed in Theorem 5.4 that our definition of f'[x] makes it a continuous function. 
This means we can apply Bolzano’s Theorem to f'[x] to prove the case of Darboux’s theorem 
for the ordinary derivatives we have defined. This is the result: 

Theorem 7.2. Darboux’s Intermediate Value Theorem 

If f'[x] exists on the interval a ≤ x ≤ b, then f'[x] attains every value intermediate
between the values f'[a] and f'[b]. In particular, if f'[a] < 0 and f'[b] > 0, then
there is an x_0, a < x_0 < b, such that f'[x_0] = 0.



Theorem 7.3. Intermediate Values for Pointwise Derivatives

Suppose that f[x] is pointwise differentiable at every point of [a, b]. Then the
derivative function D_x f[x] attains every value between D_x f[a] and D_x f[b], even
though it can be discontinuous.

Proof:

The functions

\[ g[x] = \begin{cases} D_x f[a], & \text{if } x = a \\ \dfrac{f[x] - f[a]}{x - a}, & \text{if } x \neq a \end{cases} \quad\text{and}\quad h[x] = \begin{cases} \dfrac{f[b] - f[x]}{b - x}, & \text{if } x \neq b \\ D_x f[b], & \text{if } x = b \end{cases} \]

are continuous on [a, b]. The function g[x] attains every value between D_x f[a] and
[f[b] − f[a]]/[b − a], while h[x] attains every value between [f[b] − f[a]]/[b − a] and D_x f[b].
Consequently, one or the other attains every value between D_x f[a] and D_x f[b] by Bolzano’s
Intermediate Value Theorem 4.5. In either case, an intermediate value v satisfies

\[ v = \frac{f[\beta] - f[\alpha]}{\beta - \alpha} \]

so the Mean Value Theorem for Derivatives above asserts that there is a γ with D_x f[γ] = v.
This proves the theorem.



Exercise set 7.2

1. Show that the function y = j[x\ = equals —1 when x = —2, equals -1-1 when 

X = -1-3, but never takes the value y = ^ for any value of x. Why doesn’t j\x] violate 
Bolzano’s Theorem 4.5? 

2. 1) Show that the function y = k[x] = √(x² + 2x + 1) has k'[x] = −1 when x = −2, has
k'[x] = -1-1 when x = -1-3, but k'[x] never takes the value y' = \ for any value of x. 
Why doesn’t k[x] violate Darboux’s Theorem above? 

2) In the graphing procedures using the first and second derivatives, you must compute
all values where the derivative is zero or fails to exist. Why is this a crucial part of
making the shape table? In particular, suppose you missed one x value where f'[x] failed
to exist or was zero. How could this lead you to make an incorrect graph?



7.3 Continuous Pointwise Derivatives are Uniform



Pointwise derivatives are peculiar because they do not arise from computa- 
tions with rules of calculus. This section explores the question of when a 
pointwise derivative is actually the stronger uniform kind. The answer is 
simple. 



Theorem 7.4. Continuous Pointwise Derivatives are Uniform

Let f[x] be defined on the open interval (a, b). The following are equivalent:

(1) The function f[x] is smooth with derivative f'[x] on (a, b) as defined in
Definition 5.2.

(2) The pointwise derivative exists at every point of (a, b) and defines a
continuous function, D_x f[x] = g[x].

(3) The double limit

\[ \lim_{x_1 \to x,\ x_2 \to x} \frac{f[x_2] - f[x_1]}{x_2 - x_1} \]

exists at every point x in (a, b).



Proof

(1) ⇒ (3): Assume that (1) holds. Let x_2 ≈ x_1 ≈ x_0, a real value in (a, b). Then
x_2 = x_1 + δx with δx = x_2 − x_1 ≈ 0 and x_1 is not infinitely near the ends of the interval
(a, b). By smoothness at x_1,

\[ f[x_2] = f[x_1] + f'[x_1] \cdot (x_2 - x_1) + \varepsilon \cdot (x_2 - x_1) \]

so [f[x_2] − f[x_1]]/[x_2 − x_1] ≈ f'[x_1]. We know from Theorem 5.39 of the main text that
f'[x] is continuous, so f'[x_1] ≈ f'[x_0] and we have shown that for any real value x_0 in (a, b)
and any pair of nearby values,

\[ \frac{f[x_2] - f[x_1]}{x_2 - x_1} \approx f'[x_0] \]

which is equivalent to (3).

(3) ⇒ (2): Now assume (3). As a special case of the double limit, we may let x_1 = x_0
and take lim_{x_2→x_0} [f[x_2] − f[x_0]]/[x_2 − x_0] = h[x_0] = D_x f[x_0], showing that the pointwise
derivative exists. It remains to show that h[x] = D_x f[x] is continuous.

If x_1 ≈ x_0, we need to show that h[x_1] ≈ h[x_0]. We may apply the Function Extension
Axiom to show that given an infinitesimal ε, for ‘sufficiently small’ differences between x_2
and x_1,

\[ \frac{f[x_2] - f[x_1]}{x_2 - x_1} = D_x f[x_1] + \varepsilon_1 \]

with |ε_1| < ε. We know that D_x f[x] = h[x] at all real points, hence by Extension, at all
hyperreal points and h[x_1] ≈ [f[x_2] − f[x_1]]/[x_2 − x_1].
The double limit means that whenever x_2 ≈ x_1 ≈ x_0,

\[ \frac{f[x_2] - f[x_1]}{x_2 - x_1} \approx h[x_0] \]

Hence, h[x_1] ≈ h[x_0] and h[x] = D_x f[x] is continuous.

(2) ⇒ (1): Finally, assume that the pointwise derivative exists at every point of (a, b)
and that g[x] = D_x f[x] is continuous. This means that for any real x_0 in (a, b) and x_1 ≈ x_0,
we have g[x_0] ≈ g[x_1]. We must show that for any finite x in (a, b), not infinitely near an
endpoint, and any infinitesimal δx,

\[ f[x + \delta x] = f[x] + D_x f[x] \cdot \delta x + \varepsilon \cdot \delta x \]

for an infinitesimal ε.

By the Mean Value Theorem on [x, x + δx], there is an x_1 in (x, x + δx) such that

\[ \frac{f[x + \delta x] - f[x]}{\delta x} = D_x f[x_1] \]

By the continuity hypothesis, since x ≈ x_1, we have D_x f[x] ≈ D_x f[x_1], so D_x f[x_1] =
D_x f[x] + ε, with ε ≈ 0. This means

\[ \frac{f[x + \delta x] - f[x]}{\delta x} = D_x f[x_1] = D_x f[x] + \varepsilon \]

so by algebra we have shown (1) with f'[x] = D_x f[x]:

\[ f[x + \delta x] = f[x] + D_x f[x] \cdot \delta x + \varepsilon \cdot \delta x \]

We have shown (1) ⇒ (3) ⇒ (2) ⇒ (1), so all three conditions are equivalent.
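Condition (3) is what separates the earlier example f[x] = x² Sin[π/x] from a smooth function. This is a minimal sketch in Python, with hypothetical names, that compares one-point and two-point slopes near zero using the points x_1 = 1/(2K + 1/2) and x_2 = 1/(2K − 1/2) from Chapter 6.

```python
import math

def f(x):
    """f[x] = x^2 Sin[pi/x] for x != 0, and f[0] = 0."""
    return 0.0 if x == 0 else x * x * math.sin(math.pi / x)

def two_point_slope(x1, x2):
    """The quotient (f[x2] - f[x1]) / (x2 - x1) from the double limit."""
    return (f(x2) - f(x1)) / (x2 - x1)

for K in (10, 1000, 100000):
    x1 = 1.0 / (2 * K + 0.5)   # f[x1] is about +x1^2
    x2 = 1.0 / (2 * K - 0.5)   # f[x2] is about -x2^2
    print(f"K={K:>6}  slope from 0 to x1 = {two_point_slope(0.0, x1):+.2e}"
          f"   slope from x1 to x2 = {two_point_slope(x1, x2):+.6f}")
```

The slope measured from 0 tends to the pointwise derivative 0, while the slope between the two nearby points stays near −2, so the double limit at x = 0 does not exist. This is consistent with the theorem, since D_x f[x] is not continuous at zero.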
Higher Order Derivatives



This chapter relates behavior of a function to its successive deriva- 
tives. 



The derivative is a function; its derivative is the second derivative of the original function. 
Taylor’s formula is a more accurate local formula than the “microscope approximation” 
based on a number of derivatives. It has versions for each order of derivative and has many 
uses. Taylor’s formula of order n is equivalent to having n successive derivatives. 

Higher order derivatives can also be defined directly in terms of local properties of a 
function. The first derivative arises from a local fit by a linear function. We can successfully 
fit a quadratic locally if and only if the function has two derivatives. We give a general 
result for nth order fit and n derivatives. 



8.1 Taylor’s Formula and Bending 



The first derivative tells the slope of a graph and the second derivative 
says which way the graph bends. This section gives algebraic forms of the 
graphical “smile” and “frown” icons that say which way a graph bends. 



When the second derivative is positive, a curve bends upward like part of a smile. When 
the second derivative is negative, the curve bends downward like part of a frown. The smile 
and frown icons are based on a simple intuitive mathematical idea: when the slope of the 
tangent increases, the curve bends up. We have two questions. (1) How can we formulate 
bending symbolically? (2) How do we prove that the formulation is true? First things first. 

If a curve bends up, it lies above its tangent line. Draw the picture. The tangent line at
x_0 has the formula y = b + m(x − x_0) with b = f[x_0] and m = f'[x_0]. If the graph lies above
the tangent, f[x_1] should be greater than b + m(x_1 − x_0) = f[x_0] + f'[x_0](x_1 − x_0) or

\[ f[x_1] > f[x_0] + f'[x_0](x_1 - x_0) \]

This is the answer to question (1), but now we are faced with question (2): How do we
prove it? The increment approximation says

\[ f[x_1] = f[x_0] + f'[x_0](x_1 - x_0) + \varepsilon (x_1 - x_0) \]

so this direct formulation of ‘bending up’ requires that we show that the whole error
ε(x_1 − x_0) stays positive for x_1 ≈ x_0. All we have to work with is the increment approximation
for f'[x] and the fact that f''[x_0] > 0. A direct proof is not very easy to give - at least we
don’t know a simple one.

We have formulated the result as follows. 

Theorem 8.1. Local Bending

Suppose the function f[x] is twice differentiable on the real interval a < x < b and
x_0 is a real point in this interval.

(a) If f''[x_0] > 0, then there is a real interval [α, β], a < α < x_0 < β < b, such
that y = f[x] lies above its tangent over [α, β], that is,

\[ \alpha \le x_1 \le \beta \ \Rightarrow\ f[x_1] \ge f[x_0] + f'[x_0](x_1 - x_0) \]

(b) If f''[x_0] < 0, then there is a real interval [α, β], a < α < x_0 < β < b, such
that y = f[x] lies below its tangent over [α, β], that is,

\[ \alpha \le x_1 \le \beta \ \Rightarrow\ f[x_1] \le f[x_0] + f'[x_0](x_1 - x_0) \]



Proof:

This algebraic formulation of convexity (or bending) follows easily from the second order 
Taylor formula. This formula approximates by a quadratic function in the change variable 
Sx (where x is considered fixed), not just a linear function in Sx. A general higher order 
Taylor Formula is proved later in this chapter. We want to use the second order case as 
follows to show the algebraic form of the smile icon. 

Theorem 8.2. The Second Order Taylor Small Oh Formula

If f[x] is twice differentiable on a real interval (a, b), a < x < b and x is not
infinitely near a or b, then for any infinitesimal δx

\[ f[x + \delta x] = f[x] + f'[x] \, \delta x + \tfrac{1}{2} f''[x] (\delta x)^2 + \varepsilon \cdot \delta x^2 \]

with ε ≈ 0.

Suppose that f''[x_0] > 0 at the real value x_0. If x_1 ≈ x_0, substitute x = x_0 and
δx = x_1 − x_0 into Taylor’s Second Order Formula to show:

\[ f[x + \delta x] = f[x] + f'[x] \, \delta x + \tfrac{1}{2} f''[x] (\delta x)^2 + \varepsilon \cdot \delta x^2 \]

\[ f[x_1] = f[x_0] + f'[x_0](x_1 - x_0) + \tfrac{1}{2} f''[x_0](x_1 - x_0)^2 + \varepsilon (x_1 - x_0)^2 \]

The infinitesimal smile formula follows from using the fact that (½ f''[x_0] + ε)(x_1 − x_0)² ≥ 0,

\[ f[x_1] \ge f[x_0] + f'[x_0](x_1 - x_0) \]

The Function Extension Axiom 2.1 says that since f[x_1] ≥ f[x_0] + f'[x_0](x_1 − x_0) for
all |x_1 − x_0| < d, for infinitesimal d, there must be a non-infinitesimal D such that f[x_1] ≥
f[x_0] + f'[x_0](x_1 − x_0) whenever |x_1 − x_0| < D. This proves the Local Bending Theorem.
Exercise set 8.1

1. Give an algebraic condition that says a curve bends downward. One way to do this is to
“say” the curve lies below its tangent line. Prove that your condition holds for a small
real interval containing x_0 provided f''[x_0] < 0.



8.2 Symmetric Differences and Taylor’s Formula

The symmetric difference

\[ \frac{f[x + \delta x/2] - f[x - \delta x/2]}{\delta x} \]

gives a more accurate approximation to the first derivative than the formula
(f[x + δx] − f[x])/δx ≈ f'[x].

In the computations for Galileo’s Law of Gravity in Chapter 10 of the main text, we
used symmetric differences to approximate the derivative. There is an obvious geometric
reason to suspect this is a better approximation. Look at the figure below. Graphically, the
approximation of slope given by the symmetric difference is “clearly” better on a “typical”
graph as illustrated in Figure 8.1 below.

A line through the points (x, f[x]) and (x + δx, f[x + δx]) is drawn with the tangent at x
in the left view, while a line through (x − δx, f[x − δx]) and (x + δx, f[x + δx]) is drawn with
the tangent at x in the right view. The second slope is closer to the slope of the tangent,
even though the line does not go through the point of tangency.




Figure 8.1: (f[x + δx] − f[x])/δx and (f[x + δx] − f[x − δx])/(2δx)

Now we use the second order Taylor formula to prove the algebraic form of this geometric
condition. Substitute δx/2 and −δx/2 into Taylor’s Second Order Formula to obtain

\[ f[x + \delta x/2] = f[x] + f'[x] \, \delta x/2 + \tfrac{1}{2} f''[x] (\delta x/2)^2 + \varepsilon_1 \, \delta x^2 \]

\[ f[x - \delta x/2] = f[x] - f'[x] \, \delta x/2 + \tfrac{1}{2} f''[x] (\delta x/2)^2 + \varepsilon_2 \, \delta x^2 \]

subtract the two to obtain

\[ f[x + \delta x/2] - f[x - \delta x/2] = f'[x] \, \delta x + (\varepsilon_1 - \varepsilon_2) \, \delta x^2 \]

Solve for f'[x] obtaining

\[ f'[x] = \frac{f[x + \delta x/2] - f[x - \delta x/2]}{\delta x} + \varepsilon_4 \, \delta x \]

with ε_4 = −(ε_1 − ε_2) ≈ 0. This formula is algebraically a better approximation for f'[x] than
the ordinary increment approximation f[x + δx] = f[x] + f'[x] δx + ε_3 δx, which gives

\[ f'[x] = \frac{f[x + \delta x] - f[x]}{\delta x} - \varepsilon_3 \]

Note the importance of δx being small: ε_4 · δx is a product of two small quantities.
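The difference in accuracy is easy to see with numbers. This is a minimal sketch in Python (a stand-in for the kind of Mathematica experiment used in the exercises); the sample function y[t] = 2^t and the helper names are choices made here for illustration.

```python
import math

def forward_difference(f, x, dx):
    """(f[x + dx] - f[x]) / dx, the ordinary increment approximation."""
    return (f(x + dx) - f(x)) / dx

def symmetric_difference(f, x, dx):
    """(f[x + dx/2] - f[x - dx/2]) / dx, the symmetric difference of this section."""
    return (f(x + dx / 2) - f(x - dx / 2)) / dx

def y(t):
    return 2.0 ** t  # sample function; its derivative at 0 is Log[2]

exact = math.log(2.0)
for n in range(0, 17, 4):
    dt = 0.5 ** n
    err_fwd = forward_difference(y, 0.0, dt) - exact
    err_sym = symmetric_difference(y, 0.0, dt) - exact
    print(f"dt={dt:.3e}  forward error={err_fwd:+.3e}  symmetric error={err_sym:+.3e}")
```

The forward error shrinks in proportion to dt, while the symmetric error shrinks roughly like dt², matching the ε_3 versus ε_4 · δx comparison above. For very small dt, floating-point rounding eventually dominates both.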



Exercise set 8.2

1. (a) Sketch the line through the points (x − δx, f[x − δx]) and (x, f[x]) on the left view
of Figure 8.1.

(b) Substitute x ± δx into Taylor’s second order formula and do some algebra to
obtain the approximation

\[ f'[x] = \frac{f[x + \delta x] - f[x - \delta x]}{2 \, \delta x} + \varepsilon_5 \cdot \delta x \]

(c) Show that the average of the slopes of the two secant lines on the left figure is
(f[x + δx] − f[x − δx])/(2δx), the same as the slope of the symmetric secant line
in the second view.

(d) A quadratic function q[dx] in the local variable dx that matches the graph y = f[x]
at the three x values, x − δx, x, and x + δx is given by

\[ q[dx] = y_1 + \frac{y_2 - y_1}{\delta x} \, dx + \frac{y_2 - 2 y_1 + y_3}{2 \, \delta x^2} \, [dx (dx - \delta x)] \]

where y_1 = f[x], y_2 = f[x + δx] and y_3 = f[x − δx]. Verify that the values agree
at these points by substituting the values dx = 0, dx = δx and dx = −δx into
q[dx].

(e) Show that the derivative q'[0] = (f[x + δx] − f[x − δx])/(2δx), the same as the
symmetric secant line slope.

A quadratic fit gives the same slope approximation as the symmetric one, which is also
the same as the average of a left and right approximation. All these approximations are
“second order.”
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It is interesting to compare numerical approximations to the derivative in a difficult, but 
known case. The experiments give a concrete form to the error estimates of the previous 
exercise. When we only have data (such as in the law of gravity in Chapter 9 of the main text 
or in the air resistance project in the Scientific Projects), we must use an approximation. 
In that case the symmetric formula is best. 

2. Numerical Difference Experiments

In the Project on Exponentials, you compute the derivative of $y = b^t$ directly from the increment approximation. Type and enter the following Mathematica program and compare the two methods of finding $y'[0]$.

b = 2;
y[t_] := b^t;
t = 0;
Do[
  dt = 0.5^n;
  (* step size, increment quotient, symmetric quotient *)
  Print[dt, (y[t + dt] - 1)/dt, (y[t + dt] - y[t - dt])/(2 dt)]
  , {n, 0, 16}]
(* exact value of y'[0] for comparison *)
N[Log[b], 10]



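8.3 Approximation of Second Derivatives

$$f''[x] \approx \frac{f[x + \delta x] - 2 f[x] + f[x - \delta x]}{\delta x^2}$$

Substitute $\delta x$ and $-\delta x$ into Taylor’s Second Order Formula to obtain

$$f[x + \delta x] = f[x] + f'[x]\,\delta x + \frac{1}{2} f''[x]\,\delta x^2 + \varepsilon_1\,\delta x^2$$

$$f[x - \delta x] = f[x] - f'[x]\,\delta x + \frac{1}{2} f''[x]\,\delta x^2 + \varepsilon_2\,\delta x^2$$

Add the two to obtain

$$f[x + \delta x] - 2 f[x] + f[x - \delta x] = f''[x]\,\delta x^2 + (\varepsilon_1 + \varepsilon_2)\,\delta x^2$$

Solve for $f''[x]$ to give a formula approximating the second derivative with values of the function:

$$f''[x] = \frac{f[x + \delta x] - 2 f[x] + f[x - \delta x]}{\delta x^2} - (\varepsilon_1 + \varepsilon_2)$$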
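
As a quick illustration of this formula, here is a hedged Python sketch (the helper name `second_difference` and the choice of sine at $x = 1$ are our own assumptions, not part of the text). It prints the second difference quotient next to its deviation from the exact value $-\sin[1]$.

```python
import math

def second_difference(f, x, dx):
    # (f[x + dx] - 2 f[x] + f[x - dx]) / dx^2, the quotient derived above.
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx ** 2

exact = -math.sin(1.0)  # second derivative of sine at x = 1
for n in range(1, 8):
    dx = 0.5 ** n
    approx = second_difference(math.sin, 1.0, dx)
    print(f"dx={dx:.6f}  approx={approx:.8f}  error={abs(approx - exact):.3e}")
```

For very small steps the subtraction in the numerator loses digits to roundoff, so in floating point the step should not be pushed arbitrarily close to zero.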



Exercise set 8.3




1. Second Differences for Second Derivatives 

(a) The acceleration data in the electronic homework for the Chapter on velocity and acceleration is obtained by taking differences of differences. Suppose three $x$ values are $x - \delta x$, $x$ and $x + \delta x$. Two velocities correspond to the difference quotients

$$\frac{f[x] - f[x - \delta x]}{\delta x} \qquad \frac{f[x + \delta x] - f[x]}{\delta x}$$

Compute the difference of these two differences and divide by the correct $x$ step size. What formula do you obtain?

(b) Compare the approximation for $f''[x]$ preceding the Exercise Set to the answer from part (a) of this exercise. What does the comparison tell you?

(c) Make a program like the one in Exercise 8.2.2 above to compute this direct numerical approximation to the second derivative and compare it with the exact symbolically calculated derivative of $b^t$.



8.4 The General Taylor Small Oh Formula

The general higher order Taylor formula is the following approximation of the change function $g[\delta x] = f[x + \delta x]$ by a polynomial in the change variable $\delta x$ (or sometimes $dx$) when $x$ is held fixed.

Continuity of all the derivatives is equivalent to the fact that the approximation works for all the values of $x$ strictly inside the interval. The converse result is given below.

Theorem 8.3. Taylor’s Small Oh Formula

Suppose that the real function $f[x]$ has $n$ continuous real function derivatives on the interval $(a, b)$. If $x$ is not infinitely near $a$ or $b$ and $\delta x \approx 0$ is infinitesimal, then the natural extension functions satisfy

$$f[x + \delta x] = f[x] + f'[x]\cdot\delta x + \frac{1}{2} f''[x]\cdot\delta x^2 + \frac{1}{3!} f^{(3)}[x]\cdot\delta x^3 + \cdots + \frac{1}{n!} f^{(n)}[x]\cdot\delta x^n + \varepsilon\cdot\delta x^n$$

for $\varepsilon \approx 0$.

Equivalently, for every compact subinterval $[\alpha, \beta] \subset (a, b)$,

$$\lim_{\Delta x \to 0} \frac{f[x + \Delta x] - \left(f[x] + f'[x]\cdot\Delta x + \frac{1}{2} f''[x]\cdot\Delta x^2 + \cdots + \frac{1}{n!} f^{(n)}[x]\cdot\Delta x^n\right)}{\Delta x^n} = 0$$

uniformly in $[\alpha, \beta]$.



Before we give the proof of this approximation formula, we would like you to see for yourself how it looks. The claim of the theorem is “local,” that is, the approximations are better than $\delta x^n$, but only for small $\delta x$, or only ‘in the limit.’ (Notice that if $\delta x = 0.01$ and $n = 3$, this means that the error is small compared to $\delta x^3 = 0.000001$.)

Figure 8.2: Sine and Taylor
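
To see the “local” nature of the claim numerically, the sketch below (Python, with hypothetical helper names of our own) computes the quantity $\varepsilon = (\sin[\delta x] - \text{Taylor polynomial})/\delta x^n$ for sine at $x = 0$ and shows it shrinking as $\delta x$ shrinks. It is only an illustration under these assumptions, not a proof.

```python
import math

def taylor_sine_at_zero(dx, n):
    # Taylor polynomial of Sin about 0 through degree n (only odd powers appear).
    total = 0.0
    for k in range(1, n + 1, 2):
        total += (-1) ** ((k - 1) // 2) * dx ** k / math.factorial(k)
    return total

def epsilon(dx, n):
    # The error coefficient in Sin[0 + dx] = polynomial + epsilon * dx^n.
    return (math.sin(dx) - taylor_sine_at_zero(dx, n)) / dx ** n

for dx in (1.0, 0.1, 0.01):
    print(f"dx={dx}  epsilon for n=3: {epsilon(dx, 3):.3e}")
```

With $n = 3$ the printed values fall off roughly like $\delta x^2/120$, so the error is small compared to $\delta x^3$ only once $\delta x$ itself is small.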

You will need to review Integration by Parts from Chapter 12 of the main text,

$$\int_a^b F[u]\,dG[u] = F[x]\,G[x]\Big|_a^b - \int_a^b G[u]\,dF[u]$$

in order to follow the proof of Taylor’s formula.

Taylor’s Remainder Formula using Integration by Parts

When $n = 1$, Taylor’s approximation is the increment equation of Definition 5.2. However, we want to derive a general formula for the error $\varepsilon$ using integration. In the case where $n = 1$, this just uses the Fundamental Theorem of Integral Calculus 5.1.

$$\int_0^{\delta x} \left[f'[x + u] - f'[x]\right] du = \int_0^{\delta x} f'[x + u]\,du - \int_0^{\delta x} f'[x]\,du$$

$$= \int_0^{\delta x} f'[x + u]\,du - f'[x] \int_0^{\delta x} du$$

$$= \int_0^{\delta x} f'[x + u]\,du - f'[x]\cdot\delta x$$

$$= f[x + \delta x] - f[x] - f'[x]\cdot\delta x$$

because if we take $F[u] = f[x + u]$, then $dF[u] = f'[x + u]\,du$ and the Fundamental Theorem of Integral Calculus says $\int_a^b dF[u] = F[b] - F[a]$. Rearranging the calculation we have the first order formula

$$f[x + \delta x] = f[x] + f'[x]\cdot\delta x + \int_0^{\delta x} \left[f'[x + u] - f'[x]\right] du$$



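Integration by Parts shows

$$f[x + \delta x] = f[x] + f'[x]\cdot\delta x + \frac{1}{2} f''[x]\cdot\delta x^2 + \cdots + \frac{1}{n!} f^{(n)}[x]\cdot\delta x^n + \frac{\delta x^{n-1}}{(n-1)!} \int_0^{\delta x} (1 - u/\delta x)^{n-1} \left[f^{(n)}[x + u] - f^{(n)}[x]\right] du$$

because

$$\frac{\delta x^{n-1}}{(n-1)!}\cdot f^{(n)}[x]\cdot \int_0^{\delta x} (1 - u/\delta x)^{n-1}\,du = \frac{\delta x^{n-1}}{(n-1)!}\cdot f^{(n)}[x]\cdot\frac{\delta x}{n} = \frac{\delta x^n}{n!}\cdot f^{(n)}[x]$$

and Integration by Parts with $F[u] = (1 - u/\delta x)^{n-1}$ and $dG[u] = f^{(n)}[x + u]\,du$ gives

$$\frac{\delta x^{n-1}}{(n-1)!} \int_0^{\delta x} (1 - u/\delta x)^{n-1} f^{(n)}[x + u]\,du = -\frac{\delta x^{n-1}}{(n-1)!}\, f^{(n-1)}[x] + \frac{\delta x^{n-2}}{(n-2)!} \int_0^{\delta x} (1 - u/\delta x)^{n-2} f^{(n-1)}[x + u]\,du$$

which could be further reduced (or used as an inductive hypothesis),

$$\frac{\delta x^{n-2}}{(n-2)!} \int_0^{\delta x} (1 - u/\delta x)^{n-2} f^{(n-1)}[x + u]\,du = -\frac{\delta x^{n-2}}{(n-2)!}\, f^{(n-2)}[x] + \frac{\delta x^{n-3}}{(n-3)!} \int_0^{\delta x} (1 - u/\delta x)^{n-3} f^{(n-2)}[x + u]\,du$$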

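The Error Formula: We have shown that

$$f[x + \delta x] = f[x] + f'[x]\cdot\delta x + \frac{1}{2} f''[x]\cdot\delta x^2 + \cdots + \frac{1}{n!} f^{(n)}[x]\cdot\delta x^n + \varepsilon\cdot\delta x^n$$

with the explicit formula

$$\varepsilon = \frac{1}{\delta x} \int_0^{\delta x} \frac{(1 - u/\delta x)^{n-1}}{(n-1)!} \left[f^{(n)}[x + u] - f^{(n)}[x]\right] du$$

Now we will show that $\varepsilon$ is small when $\delta x$ is small.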
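
The explicit error formula can be checked numerically in a simple case. The Python sketch below is our own illustration under stated assumptions: it takes $f = \exp$ and $n = 2$, computes $\varepsilon$ directly from the Taylor formula, and compares it with a midpoint-rule approximation of the integral. The function names and the number of subintervals are hypothetical choices.

```python
import math

def epsilon_direct(x, dx):
    # epsilon from f[x+dx] = f[x] + f'[x] dx + f''[x] dx^2 / 2 + epsilon dx^2, with f = exp.
    taylor = math.exp(x) * (1.0 + dx + dx ** 2 / 2.0)
    return (math.exp(x + dx) - taylor) / dx ** 2

def epsilon_integral(x, dx, pieces=2000):
    # Midpoint rule for (1/dx) * Integral_0^dx (1 - u/dx) [f''[x+u] - f''[x]] du,
    # the n = 2 case of the explicit formula (so (n-1)! = 1), again with f = exp.
    h = dx / pieces
    total = 0.0
    for k in range(pieces):
        u = (k + 0.5) * h
        total += (1.0 - u / dx) * (math.exp(x + u) - math.exp(x)) * h
    return total / dx

for dx in (0.5, 0.1, 0.01):
    print(dx, epsilon_direct(0.0, dx), epsilon_integral(0.0, dx))
```

The two columns should agree to within the accuracy of the quadrature, and both shrink with $\delta x$, as the proof that follows asserts.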

Proof of Taylor’s Small Oh Formula

To show that $\varepsilon \approx 0$, it is sufficient to notice that continuity makes $f^{(n)}[x + u] - f^{(n)}[x] \approx 0$ for $0 \le u \le \delta x$, so the maximum

$$m = \mathrm{Max}\left[\,|f^{(n)}[x + u] - f^{(n)}[x]| : 0 \le u \le \delta x\,\right] \approx 0$$

and

$$|\varepsilon| \le \frac{m}{\delta x} \int_0^{\delta x} \frac{(1 - u/\delta x)^{n-1}}{(n-1)!}\,du = \frac{m}{\delta x}\cdot\frac{\delta x}{n!} = \frac{m}{n!} \approx 0$$

This completes the proof. The equivalent “epsilon - delta” condition follows as in the proof of Theorem 3.4.

8.4.1 The Converse of Taylor's Theorem

Theorem 8.4. The Converse of Taylor’s Theorem

Let $f[x]$ be a real function defined on an interval $(\alpha, \omega)$. Suppose there are real functions $a_h[x]$, $h = 0, \cdots, k$, defined on $(\alpha, \omega)$ such that whenever a hyperreal $x$ is in $(\alpha, \omega)$ and not infinitely near the endpoints, and $\delta x \approx 0$, the natural extensions satisfy

$$f[x + \delta x] = \sum_{h=0}^{k} \frac{1}{h!}\, a_h[x]\, \delta x^h + \varepsilon\cdot\delta x^k$$

with $\varepsilon \approx 0$. Then $f[x]$ is $k$-times differentiable with derivatives $f^{(h)}[x] = a_h[x]$.

Proof:

First, we show that the coefficient functions are continuous. Consider $k = 0$. Take $\delta x = 0$ to see that $f[x + 0] = a_0[x] + 0$. Take $\delta x \approx 0$ to see $f[x]$ is continuous,

$$f[x + \delta x] = f[x] + \varepsilon$$

with $\varepsilon \approx 0$.

Consider $k = 1$ and take two infinitesimals $\delta x_0$ and $\delta x_1$ of comparable size, $\delta x_0/\delta x_1$ and $\delta x_1/\delta x_0$ both finite. Expand at $\xi = x + \delta x_0$ and at $x$,

$$f[x + \delta x_0 + \delta x_1] - f[x + \delta x_0] = a_1[x + \delta x_0]\,\delta x_1 + \varepsilon_1\,\delta x_1$$

$$f[x + \delta x_0 + \delta x_1] - f[x] + f[x] - f[x + \delta x_0] = a_1[x]\,(\delta x_0 + \delta x_1) - a_1[x]\,\delta x_0 + \varepsilon_2\,\delta x_1$$

$$f[x + \delta x_0 + \delta x_1] - f[x + \delta x_0] = a_1[x]\,\delta x_1 + \varepsilon_2\,\delta x_1$$

Solving, $a_1[x + \delta x_0] = a_1[x] + (\varepsilon_2 - \varepsilon_1)$ proves continuity,

$$a_1[x + \delta x_0] \approx a_1[x]$$

The case $k = 2$ is Exercise 8.4.3 below. The general case follows by expanding the deleted differences ($\widehat{\delta x_i}$ indicates it is deleted from the expression) for $k + 1$ comparable infinitesimals, $\delta x_0, \delta x_1, \cdots, \delta x_k$,

$$f[x + \delta x_0 + \delta x_1 + \cdots + \delta x_k]$$

$$- \sum_{i=1}^{k} f[x + \delta x_0 + \delta x_1 + \cdots + \widehat{\delta x_i} + \cdots + \delta x_k]$$

$$+ \sum_{i<j} f[x + \delta x_0 + \delta x_1 + \cdots + \widehat{\delta x_i} + \cdots + \widehat{\delta x_j} + \cdots + \delta x_k]$$

$$\vdots$$

$$+ (-1)^{k-1} \sum_{j=1}^{k} f[x + \delta x_0 + \delta x_j]$$

$$+ (-1)^k f[x + \delta x_0]$$

about $\xi = x + \delta x_0$ and about $x$, obtaining this expression equal to both

$$a_k[x + \delta x_0]\, \delta x_1\cdot\delta x_2 \cdots \delta x_k + \varepsilon_1\cdot\delta x_1^k = a_k[x]\, \delta x_1\cdot\delta x_2\cdots\delta x_k + \varepsilon_2\cdot\delta x_0^k$$



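Proof that $f^{(h)}[x] = a_h[x]$ is by induction on $k$. The case $k = 1$ is the Definition 5.2.

Suppose that we know $f^{(h)}[x] = a_h[x]$, for $h = 0, 1, \cdots, k$. Let $\delta x_1$ and $\delta x_2$ be comparable infinitesimals and expand two ways:

$$f[x + \delta x_1 + \delta x_2]$$

$$= \sum_{h=0}^{k} \frac{1}{h!}\, f^{(h)}[x + \delta x_1]\,\delta x_2^h + \frac{1}{(k+1)!}\, a_{k+1}[x + \delta x_1]\,\delta x_2^{k+1} + \varepsilon_1\,\delta x_2^{k+1}$$

$$= \sum_{h=0}^{k} \frac{1}{h!}\, f^{(h)}[x]\,(\delta x_1 + \delta x_2)^h + \frac{1}{(k+1)!}\, a_{k+1}[x]\,(\delta x_1 + \delta x_2)^{k+1} + \varepsilon_2\,\delta x_2^{k+1}$$

Now form the difference, and write it as a polynomial in $\delta x_2$,

$$\varepsilon_3\cdot\delta x_2^{k+1} = \sum_{h=0}^{k} \frac{1}{h!} \left(f^{(h)}[x + \delta x_1]\,\delta x_2^h - f^{(h)}[x]\,(\delta x_1 + \delta x_2)^h\right) + \frac{1}{(k+1)!}\left(a_{k+1}[x + \delta x_1]\,\delta x_2^{k+1} - a_{k+1}[x]\,(\delta x_1 + \delta x_2)^{k+1}\right)$$

$$= \sum_{h=0}^{k+1} b_h[x, \delta x_1]\,\delta x_2^h$$

Expansion in $\delta x_2$ gives the terms

$$b_{k+1}[x, \delta x_1] = \frac{1}{(k+1)!}\left(a_{k+1}[x + \delta x_1] - a_{k+1}[x]\right)$$

and

$$b_k[x, \delta x_1] = \frac{1}{k!}\left(f^{(k)}[x + \delta x_1] - f^{(k)}[x] - a_{k+1}[x]\,\delta x_1\right)$$

Since $a_{k+1}[x]$ is continuous, we have

$$\varepsilon\cdot\delta x_2^{k+1} = \frac{1}{k!}\left(f^{(k)}[x + \delta x_1] - f^{(k)}[x] - a_{k+1}[x]\,\delta x_1\right)\delta x_2^k + \sum_{h=0}^{k-1} b_h[x, \delta x_1]\,\delta x_2^h$$

For any $\lambda_0, \cdots, \lambda_k$ distinct nonzero real numbers, we have the invertible Vandermonde system of equations

$$\begin{bmatrix} 1 & \lambda_0 & \cdots & \lambda_0^k \\ 1 & \lambda_1 & \cdots & \lambda_1^k \\ \vdots & & & \vdots \\ 1 & \lambda_k & \cdots & \lambda_k^k \end{bmatrix} \begin{bmatrix} b_0 \\ b_1\,\delta x_2 \\ \vdots \\ b_k\,\delta x_2^k \end{bmatrix} = \begin{bmatrix} \varepsilon_0\,(\lambda_0\,\delta x_2)^{k+1} \\ \varepsilon_1\,(\lambda_1\,\delta x_2)^{k+1} \\ \vdots \\ \varepsilon_k\,(\lambda_k\,\delta x_2)^{k+1} \end{bmatrix} = \delta x_2^{k+1} \begin{bmatrix} \iota_0 \\ \iota_1 \\ \vdots \\ \iota_k \end{bmatrix}$$

with $\iota_h \approx 0$ for $h = 0, \cdots, k$. Applying the real inverse matrix to both sides, we obtain

$$\begin{bmatrix} b_0[x, \delta x_1] \\ b_1[x, \delta x_1]\,\delta x_2 \\ \vdots \\ b_k[x, \delta x_1]\,\delta x_2^k \end{bmatrix} = \delta x_2^{k+1} \begin{bmatrix} \eta_0 \\ \eta_1 \\ \vdots \\ \eta_k \end{bmatrix}$$

with $\eta_h \approx 0$ for $h = 0, \cdots, k$. In particular,

$$b_k[x, \delta x_1] = \frac{1}{k!}\left(f^{(k)}[x + \delta x_1] - f^{(k)}[x] - a_{k+1}[x]\,\delta x_1\right) = \zeta\,\delta x_1$$

with $\zeta \approx 0$. This proves that $f^{(k)}[x]$ satisfies Definition 5.2.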



Exercise set 8.4

1. Show that the Taylor polynomials for sine at $x = 0$ satisfy

$$\mathrm{Sin}[0 + dx] = dx - \frac{1}{6}\cdot dx^3 + \frac{1}{5!}\cdot dx^5 + \cdots$$

and use Mathematica to compare the plots as follows

Plot[{Sin[0 + dx], dx, dx - dx^3/6, dx - dx^3/6 + dx^5/5!}
  , {dx, -3 Pi/2, 3 Pi/2}]

Make similar graphical comparisons for Cos[0 + dx] and Exp[0 + dx] $= e^{dx}$.

2. This exercise works through the steps in finding the second order formula.

Calculate the following second order integral by breaking it into two pieces,

$$\delta x \cdot \int_0^{\delta x} (1 - u/\delta x)\left[f''[x + u] - f''[x]\right] du$$

$$= \delta x\cdot\int_0^{\delta x} (1 - u/\delta x)\, f''[x + u]\,du - \delta x\cdot\int_0^{\delta x} (1 - u/\delta x)\, f''[x]\,du$$

$$= \delta x\cdot\int_0^{\delta x} (1 - u/\delta x)\, f''[x + u]\,du - \delta x\cdot f''[x] \int_0^{\delta x} (1 - u/\delta x)\,du$$

First, compute the integral $\int_0^{\delta x} (1 - u/\delta x)\,du = \delta x/2$, by symbolic means or by noticing that it is the area of a triangle of height 1 and base $\delta x$. Second, use Integration by Parts with $F[u] = (1 - u/\delta x)$ and $dG[u] = f''[x + u]\,du$ to show

$$\delta x\cdot\int_0^{\delta x} (1 - u/\delta x)\, f''[x + u]\,du = -\delta x\cdot f'[x] + \int_0^{\delta x} f'[x + u]\,du$$

$$= -\delta x\cdot f'[x] + f[x + \delta x] - f[x]$$

Finally, combine your second order results to show that

$$f[x + \delta x] = f[x] + f'[x]\cdot\delta x + \frac{1}{2} f''[x]\cdot\delta x^2 + \delta x\cdot\int_0^{\delta x} (1 - u/\delta x)\left[f''[x + u] - f''[x]\right] du$$

3. Suppose that

$$f[x + \delta x] = f[x] + a_1[x]\,\delta x + \frac{1}{2}\, a_2[x]\,\delta x^2 + \varepsilon\,\delta x^2$$

Take three comparable infinitesimals $\delta x_0$, $\delta x_1$, $\delta x_2$, and expand the following:

$$f[\xi + \delta x_1 + \delta x_2] - f[\xi] + f[\xi] - f[\xi + \delta x_1] - f[\xi + \delta x_2] + f[\xi] = a_2[\xi]\,\delta x_1\,\delta x_2 + \varepsilon_1\,\delta x_1^2, \qquad \xi = x + \delta x_0$$

$$f[x + \delta x_0 + \delta x_1 + \delta x_2] - f[x] + f[x] - f[x + \delta x_0 + \delta x_1] - f[x + \delta x_0 + \delta x_2] + f[x] - f[x] + f[x + \delta x_0] = a_2[x]\,\delta x_1\,\delta x_2 + \varepsilon_2\,\delta x_0^2$$

to show that $a_2[x + \delta x_0] \approx a_2[x]$, so $a_2[x]$ is continuous.



8.5 Direct Interpretation of Higher Order Derivatives

We know that the first derivative tells us the slope and the second derivative tells us the concavity or convexity (frown or smile), but what do the third, fourth, and higher derivatives tell us?

The symmetric limit interpretation of derivative arose from fitting the curve $y = f[x]$ at the points $x - \delta x$ and $x + \delta x$ and then taking the limit of the quadratic fit. A more detailed approach to studying higher order properties of the graph is to fit a polynomial to several points and take a limit. To determine a quadratic fit to a curve, we would need three points, say $x - \delta x$, $x$, and $x + \delta x$. We would then have three values of the function, $f[x - \delta x]$, $f[x]$, and $f[x + \delta x]$ to use to determine unknown coefficients in the interpolation polynomial $p[dx] = a_0 + a_1\,dx + a_2\,dx^2$. We could solve for these coefficients in order to make $f[x - \delta x] = p[-\delta x]$, $f[x] = p[0]$, and $f[x + \delta x] = p[\delta x]$. This solution can be easily done with Mathematica commands given in the next exercise. The limit of this fit tends to the second order Taylor polynomial,

$$\lim_{\delta x \to 0} p[dx] = f[x] + f'[x]\,dx + \frac{1}{2} f''[x]\,dx^2$$

This approach extends to as many derivatives as we wish. If we fit to $n + 1$ points, we can determine the $n + 1$ coefficients in the polynomial

$$p[dx] = a_0 + a_1\,dx + \cdots + a_n\,dx^n$$

so that $p[\delta x_i] = f[x + \delta x_i]$ for $i = 0, 1, \cdots, n$. If the function $f[x]$ is $n$ times continuously differentiable, then

$$\lim_{\delta x \to 0} p[dx] = f[x] + f'[x]\,dx + \frac{1}{2} f''[x]\,dx^2 + \cdots + \frac{1}{n!} f^{(n)}[x]\,dx^n$$

Specifically, if $p[dx] = a_0[x, \delta x] + a_1[x, \delta x]\,dx + \cdots + a_n[x, \delta x]\,dx^n$, then

$$\lim_{\delta x \to 0} a_k[x, \delta x] = \frac{1}{k!}\, f^{(k)}[x] \qquad \text{for all } k = 0, 1, \cdots, n$$

uniformly for $x$ in compact intervals. The higher derivatives mean no more or less than the coefficients of a local polynomial fit to the function. In other words, once we understand the geometric meaning of the $dx^3$ coefficient in a cubic polynomial, we can apply that knowledge locally to a thrice differentiable function. Before we prove this amazing fact, we would like you to “see” how it works by using Mathematica to fit the polynomials in Exercise 8.5.1.
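
For readers without Mathematica at hand, the quadratic case can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper name `quadratic_fit_coefficients` is hypothetical, and the closed forms used for the coefficients are the ones obtained by solving $p[-\delta x] = f[x - \delta x]$, $p[0] = f[x]$, $p[\delta x] = f[x + \delta x]$ for $p[dx] = a_0 + a_1\,dx + a_2\,dx^2$.

```python
import math

def quadratic_fit_coefficients(f, x, dx):
    # Coefficients of p[t] = a0 + a1 t + a2 t^2 with p[-dx], p[0], p[dx]
    # equal to f[x - dx], f[x], f[x + dx].
    y_minus, y_zero, y_plus = f(x - dx), f(x), f(x + dx)
    a0 = y_zero
    a1 = (y_plus - y_minus) / (2.0 * dx)
    a2 = (y_plus - 2.0 * y_zero + y_minus) / (2.0 * dx ** 2)
    return a0, a1, a2

# Compare with the Taylor coefficients f, f', f''/2 of Exp at x = 0, namely 1, 1, 1/2.
for dx in (0.5, 0.25, 0.125, 0.001):
    print(dx, quadratic_fit_coefficients(math.exp, 0.0, dx))
```

As the step shrinks, the printed triples approach $(1, 1, 1/2)$, the coefficients of the second order Taylor polynomial of the exponential at $0$.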



8.5.1 Basic Theory of Interpolation

Let $f[x]$ be a real-valued function defined on $(\alpha, \omega)$, and let $X = \{x_0, x_1, \ldots, x_n\}$ be $n + 1$ distinct points in the interval. The “Lagrange form” of the polynomial of degree $n$ that has the same values at the $x_i$ is

$$p_X[x] = \sum_{i=0}^{n} f[x_i] \prod_{\substack{j=0 \\ j \ne i}}^{n} \frac{x - x_j}{x_i - x_j}$$

We say “$p_X[x]$ interpolates $f$ on $X$.” For example, when $n = 2$,

$$p_X[x] = f[x_0]\,\frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + f[x_1]\,\frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + f[x_2]\,\frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}$$

By substitution we see that $p_X[x_i] = f[x_i]$ for $i = 0, 1, \ldots, n$. A polynomial of order $n$ with this interpolation property is unique, because the difference of two such polynomials has $n + 1$ zeros and thus is identically zero.

The interpolation polynomial may also be written in “Newton form” with ascending terms depending only on successive values of $x_i$:

$$p_X[x] = a_0[x_0] + a_1[x_0, x_1]\,(x - x_0) + \cdots + a_n[x_0, \ldots, x_n] \prod_{i=0}^{n-1} (x - x_i)$$

Substitution of $x_0$ in the Newton form shows

$$a_0[x_0] = f[x_0]$$

Equating the $n$th order terms in both the Newton and Lagrange form, we obtain

$$\text{(NewtLa)} \qquad a_n[x_0, \ldots, x_n] = \sum_{i=0}^{n} \frac{f[x_i]}{\prod_{j \ne i} (x_i - x_j)}$$

This formula (NewtLa) for $a_n[x_0, \ldots, x_n]$ shows the symmetry of the coefficients. That is, if $k_i$ is any permutation of $\{0, 1, \ldots, n\}$, then

$$\text{(Symm)} \qquad a_n[x_{k_0}, \ldots, x_{k_n}] = a_n[x_0, \ldots, x_n].$$



Applying the formula (NewtLa) to the right hand side of the next equation and putting the resulting expression on a common denominator justifies the divided difference recursion

$$\text{(DiffQ)} \qquad a_n[x_0, \ldots, x_n] = \frac{a_{n-1}[x_1, \ldots, x_n] - a_{n-1}[x_0, \ldots, x_{n-1}]}{x_n - x_0}$$

Successive substitution into (DiffQ) shows

$$a_0[x_0] = f[x_0],$$

$$a_1[x_0, x_1] = \frac{f[x_1] - f[x_0]}{x_1 - x_0}$$

$$a_2[x_0, x_1, x_2] = \frac{1}{x_2 - x_0}\left(\frac{f[x_2] - f[x_1]}{x_2 - x_1} - \frac{f[x_1] - f[x_0]}{x_1 - x_0}\right)$$

Because of the relation of the Newton coefficients to divided differences of $f[x]$, we denote them by

$$\delta^n f[x_0, \ldots, x_n] = a_n[x_0, \ldots, x_n]$$
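
The recursion (DiffQ) translates directly into a short program. The following Python sketch is our own illustration (the names `divided_difference` and `newton_polynomial` are hypothetical): it computes $\delta^n f$ by the recursion and evaluates the Newton form of the interpolation polynomial.

```python
def divided_difference(f, points):
    # delta^n f[x0, ..., xn] by the recursion (DiffQ); the points must be distinct.
    if len(points) == 1:
        return f(points[0])
    upper = divided_difference(f, points[1:])
    lower = divided_difference(f, points[:-1])
    return (upper - lower) / (points[-1] - points[0])

def newton_polynomial(f, points, x):
    # Newton form: sum over n of delta^n f[x0..xn] * (x - x0) ... (x - x_{n-1}).
    total = 0.0
    product = 1.0
    for n in range(len(points)):
        total += divided_difference(f, points[: n + 1]) * product
        product *= x - points[n]
    return total

def cube(x):
    return x ** 3

points = [0.0, 1.0, 2.0, 4.0]
print(divided_difference(cube, points))            # leading coefficient of x^3
print(divided_difference(cube, points[::-1]))      # same value, by (Symm)
print([newton_polynomial(cube, points, p) for p in points])
```

The plain recursion recomputes lower order differences many times; for more than a handful of points a table of differences would be the usual choice, but the direct form keeps the link to (DiffQ) visible.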



If $f[x]$ is differentiable at a point, we may extend the definition of $\delta^n f$ to include a repetition. This extension continues to satisfy the functional identity (DiffQ).

Theorem 8.5. Limiting Differences

(a) If $f'[x_i]$ exists (as pointwise limit) then the following limit exists,

$$\lim_{t \to 0} \delta^{n+1} f[x_0, \cdots, x_i, x_i + t, \ldots, x_n]$$

We denote this limit $\delta^{n+1} f[x_0, \cdots, x_{i-1}, x_i, x_i, x_{i+1}, \ldots, x_n]$.

(b) If $f[x]$ is pointwise differentiable on $(\alpha, \omega)$ and $\{x_0, \ldots, x_{n+1}\}$ has one repetition amongst $x_0, \ldots, x_n$, then the functional identity (DiffQ) still holds

$$\delta^{n+1} f[x_0, \cdots, x_{n+1}] = \frac{1}{x_{n+1} - x_0}\left\{\delta^n f[x_1, \ldots, x_{n+1}] - \delta^n f[x_0, \ldots, x_n]\right\}$$

with the extended definition of part (a) as needed.



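Proof (a):

If $n = 0$, the limit

$$\lim_{t \to 0} \frac{f[x_0 + t] - f[x_0]}{t} = f'[x_0]$$

exists and we may define $\delta^1 f[x_0, x_0] = f'[x_0]$.

If $n > 0$, Newton’s interpolation formula at distinct points $x_0, x_0 + t, x_1, x_2, \ldots, x_n$ gives

$$f[x_0 + t] = f[x_0] + \sum_{i=1}^{n} t\,\delta^i f[x_0, \ldots, x_i] \prod_{j=1}^{i-1} (x_0 + t - x_j) + t\,\delta^{n+1} f[x_0, \cdots, x_n, x_0 + t] \prod_{j=1}^{n} (x_0 + t - x_j)$$

so that

$$\delta^{n+1} f[x_0, x_1, \cdots, x_n, x_0 + t] = \frac{\dfrac{f[x_0 + t] - f[x_0]}{t} - \delta^1 f[x_0, x_1] - \sum_{i=2}^{n} \delta^i f[x_0, \cdots, x_i] \prod_{j=1}^{i-1} (x_0 + t - x_j)}{\prod_{j=1}^{n} (x_0 + t - x_j)}$$

$$\to \frac{f'[x_0] - \delta^1 f[x_0, x_1] - \sum_{i=2}^{n} \delta^i f[x_0, \cdots, x_i] \prod_{j=1}^{i-1} (x_0 - x_j)}{\prod_{j=1}^{n} (x_0 - x_j)}$$

This proves (a).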

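Proof (b):

To prove (b), we may use symmetry and assume $x_0 = x_1$, so by definition of the extended formula and identity (DiffQ),

$$\delta^{n+1} f[x_0, x_1, \ldots, x_{n+1}] = \lim_{t \to 0} \delta^{n+1} f[x_0, x_0 + t, x_2, \cdots, x_{n+1}]$$

$$= \lim_{t \to 0} \frac{1}{x_{n+1} - x_0}\left\{\delta^n f[x_0 + t, x_2, \ldots, x_{n+1}] - \delta^n f[x_0, x_0 + t, \ldots, x_n]\right\}$$

$$= \frac{1}{x_{n+1} - x_0}\left\{\delta^n f[x_1, x_2, \cdots, x_{n+1}] - \delta^n f[x_0, x_0, x_2, \cdots, x_n]\right\}$$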



8.5.2 Interpolation where f is Smooth

If we know that $f[x]$ has $n$ ordinary continuous derivatives (Definition 5.2), then we have the following elegant formula.

Theorem 8.6. Hermite-Genocchi Formula

Suppose $f[x]$ has $n$ derivatives in $(\alpha, \omega)$, for $n \ge 1$. Choose distinct points $x_0, \ldots, x_n$ in $(\alpha, \omega)$. Then

$$\delta^n f[x_0, \ldots, x_n] = \int \cdots \int_{t_i \ge 0,\ \sum_{i=0}^{n} t_i = 1} f^{(n)}[t_0 x_0 + \cdots + t_n x_n]\,dt_1 \ldots dt_n$$

$$= \int \cdots \int_{t_i \ge 0,\ \sum_{i=1}^{n} t_i \le 1} f^{(n)}[x_0 + t_1 (x_1 - x_0) + \cdots + t_n (x_n - x_0)]\,dt_1 \ldots dt_n$$

Proof:

First, the two integrals are equivalent because

$$\sum_{i=1}^{n} t_i = 1 - t_0$$

Second, if $n = 1$, the Fundamental Theorem of Integral Calculus, with $G[t] = f[x_0 + t(x_1 - x_0)]/(x_1 - x_0)$ and $\frac{dG}{dt} = f'[x_0 + t(x_1 - x_0)]$, shows the Hermite-Genocchi Formula,

$$\delta^1 f[x_1, x_0] = \frac{f[x_1] - f[x_0]}{x_1 - x_0} = \int_0^1 f'[x_0 + t(x_1 - x_0)]\,dt$$



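Third, for $n > 1$, use successive integration and the Fundamental Theorem to show that the Hermite-Genocchi integrals satisfy the recursion (DiffQ):

$$\int \cdots \int_{t_i \ge 0,\ \sum t_i \le 1} f^{(n)}[x_0 + t_1 (x_1 - x_0) + \cdots + t_n (x_n - x_0)]\,dt_1 \ldots dt_n$$

$$= \int \cdots \int \left( \int_{t_n = 0}^{1 - \sum_{i=1}^{n-1} t_i} f^{(n)}[x_0 + t_1 (x_1 - x_0) + \cdots + t_n (x_n - x_0)]\,dt_n \right) dt_1 \ldots dt_{n-1}$$

$$= \int \cdots \int \left[ \frac{f^{(n-1)}[x_0 + t_1 (x_1 - x_0) + \cdots + t_n (x_n - x_0)]}{x_n - x_0} \right]_{t_n = 0}^{t_n = 1 - \sum_{i=1}^{n-1} t_i} dt_1 \ldots dt_{n-1}$$

$$= \frac{1}{x_n - x_0} \left\{ \int \cdots \int f^{(n-1)}[x_n + t_1 (x_1 - x_n) + \cdots + t_{n-1} (x_{n-1} - x_n)]\,dt_1 \ldots dt_{n-1} - \int \cdots \int f^{(n-1)}[x_0 + t_1 (x_1 - x_0) + \cdots + t_{n-1} (x_{n-1} - x_0)]\,dt_1 \ldots dt_{n-1} \right\}$$

Since both $\delta^n f$ and the integrals agree when $n = 1$ and since both satisfy the recursion (DiffQ), the two are equal and we have proved the theorem.

If $f[x] = x^n$, then $f^{(n)}[x] = n!$ and $\delta^n f[x_0, \ldots, x_n] = 1$ by equating coefficients of $x^n$, so

$$1 = \int \cdots \int_{t_i \ge 0,\ \sum_{i=1}^{n} t_i \le 1} n!\ dt_1 \cdots dt_n$$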
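
The formula can be spot-checked numerically for $n = 2$. The Python sketch below is an illustration under our own assumptions (hypothetical function names, a nested midpoint rule with an arbitrary number of subintervals, and the exponential as test function); it compares $\delta^2 f[x_0, x_1, x_2]$ with the integral of $f''$ over the triangle $t_1, t_2 \ge 0$, $t_1 + t_2 \le 1$.

```python
import math

def divided_difference(f, points):
    # delta^n f by the recursion (DiffQ), for distinct points.
    if len(points) == 1:
        return f(points[0])
    upper = divided_difference(f, points[1:])
    lower = divided_difference(f, points[:-1])
    return (upper - lower) / (points[-1] - points[0])

def simplex_integral(second_derivative, x0, x1, x2, pieces=200):
    # Nested midpoint rule for the double integral of
    # f''[x0 + t1 (x1 - x0) + t2 (x2 - x0)] over t1, t2 >= 0, t1 + t2 <= 1.
    total = 0.0
    for i in range(pieces):
        t1 = (i + 0.5) / pieces
        width = 1.0 - t1          # t2 runs from 0 to 1 - t1
        inner = 0.0
        for j in range(pieces):
            t2 = (j + 0.5) * width / pieces
            inner += second_derivative(x0 + t1 * (x1 - x0) + t2 * (x2 - x0))
        total += inner * (width / pieces) / pieces
    return total

# For f = exp the second derivative is again exp.
print(divided_difference(math.exp, [0.0, 0.5, 1.0]))
print(simplex_integral(math.exp, 0.0, 0.5, 1.0))
```

With the constant integrand $2$ (the second derivative of $x^2$) the same routine returns $1$, which is the $n = 2$ case of the identity displayed just above.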



8.5.3 Smoothness From Differences

We say $\delta^n f$ is S-continuous on $(\alpha, \omega)$ if whenever we choose nearby infinitesimal sequences, we obtain nearly the same finite results. That is, suppose $x_0 \approx x_1 \approx \cdots \approx x_n$ are distinct infinitely close points near a real $b$ in $(\alpha, \omega)$ and $\xi_0 \approx \xi_1 \approx \cdots \approx \xi_n$ are also near $b$. (The $x_i$ points are distinct and the $\xi_i$ points are distinct, but the sets $\{x_0, \ldots, x_n\}$ and $\{\xi_0, \ldots, \xi_n\}$ may overlap.) Then

$$\delta^n f[x_0, \ldots, x_n] \approx \delta^n f[\xi_0, \ldots, \xi_n]$$

and both are finite numbers.

Theorem 8.7. Theorem on Higher Order Smoothness

Let $f[x]$ be a real function defined on a real open interval $(\alpha, \omega)$. Then $f[x]$ is $n$-times continuously differentiable on $(\alpha, \omega)$ if and only if the $n$th-order differences $\delta^n f$ are S-continuous on $(\alpha, \omega)$.

Proof that Smooth Implies Stable Differences:

The implication follows from the Hermite-Genocchi Formula, Theorem 8.6, and shows

$$\delta^n f[x_0, \ldots, x_n] \approx \frac{1}{n!}\, f^{(n)}[b]$$

whenever $x_0 \approx \cdots \approx x_n \approx b$.

If $x_0 \approx x_1 \approx \cdots \approx x_n$, then for all $0 \le t_i \le 1$ with $\sum_{i=0}^{n} t_i = 1$, $f^{(n)}[t_0 x_0 + \cdots + t_n x_n] \approx f^{(n)}[b]$, so

$$\delta^n f[x_0, \cdots, x_n] = \int \cdots \int_{\sum_{i=0}^{n} t_i = 1} f^{(n)}[t_0 x_0 + \cdots + t_n x_n]\,dt_1 \ldots dt_n \approx f^{(n)}[b] \int \cdots \int_{\sum_{i=0}^{n} t_i = 1} dt_1 \ldots dt_n = \frac{1}{n!}\, f^{(n)}[b]$$

We prove the converse by induction and need some technical lemmas. The case $n = 0$ is trivial and the case $n = 1$ follows from Theorem 3.5.

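Theorem 8.8. Technical Lemma 1

If $f[x]$ is pointwise differentiable with derivative $f'[x]$ on $(\alpha, \omega)$ and $x_0, x_1, \ldots, x_n$ are distinct points in $(\alpha, \omega)$, then

$$(\delta') \qquad \delta^n f'[x_0, \ldots, x_n] = \sum_{i=0}^{n} \delta^{n+1} f[x_0, \cdots, x_n, x_i]$$

Proof of Lemma 1:

If $n = 0$,

$$\delta^0 f'[x_0] = f'[x_0] = \lim_{t \to 0} \delta^1 f[x_0, x_0 + t] = \delta^1 f[x_0, x_0]$$

Assume that the formula holds for $n$ and that $x_0, \ldots, x_{n+1}$ are distinct points in $(\alpha, \omega)$. Use the recurrence formula (DiffQ),

$$\delta^{n+1} f'[x_0, \cdots, x_{n+1}] = \frac{1}{x_{n+1} - x_0}\left\{\delta^n f'[x_1, \ldots, x_{n+1}] - \delta^n f'[x_0, \ldots, x_n]\right\}$$

Next, use the induction hypothesis,

$$\delta^{n+1} f'[x_0, \ldots, x_{n+1}] = \frac{1}{x_{n+1} - x_0}\left\{\sum_{i=1}^{n+1} \delta^{n+1} f[x_1, \cdots, x_{n+1}, x_i] - \sum_{i=0}^{n} \delta^{n+1} f[x_0, \cdots, x_n, x_i]\right\}$$

Finally, use part (b) of Theorem 8.5

$$\delta^{n+1} f'[x_0, \cdots, x_{n+1}] = \frac{1}{x_{n+1} - x_0}\Big\{\delta^{n+1} f[x_1, \cdots, x_{n+1}, x_{n+1}] + \sum_{i=1}^{n}\left(\delta^{n+1} f[x_1, \cdots, x_{n+1}, x_i] - \delta^{n+1} f[x_0, \cdots, x_n, x_i]\right) - \delta^{n+1} f[x_0, \cdots, x_n, x_0]\Big\}$$

$$= \sum_{i=1}^{n} \delta^{n+2} f[x_0, \cdots, x_{n+1}, x_i] + \frac{1}{x_{n+1} - x_0}\left\{\delta^{n+1} f[x_1, \cdots, x_{n+1}, x_{n+1}] - \delta^{n+1} f[x_1, \cdots, x_{n+1}, x_0]\right\} + \frac{1}{x_{n+1} - x_0}\left\{\delta^{n+1} f[x_0, x_1, \ldots, x_{n+1}] - \delta^{n+1} f[x_0, x_1, \ldots, x_n, x_0]\right\}$$

$$= \sum_{i=0}^{n+1} \delta^{n+2} f[x_0, \cdots, x_{n+1}, x_i]$$

This proves the lemma.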

Theorem 8.9. Technical Lemma 2

If $\delta^{n+1} f$ is S-continuous on $(\alpha, \omega)$, then $\delta^k f$ is also S-continuous for all $k = 0, 1, \ldots, n$.

Proof of Lemma 2:

It is sufficient to prove this for $k = n$, by reduction. Suppose $x_0 \approx x_1 \approx \ldots \approx x_n$ and $\xi_0 \approx \xi_1 \approx \ldots \approx \xi_n$ are near a real $b$. We wish to show $\delta^n f[x_0, \ldots, x_n] \approx \delta^n f[\xi_0, \ldots, \xi_n]$ and both are finite. We may assume that $\{x_0, \ldots, x_n\} \ne \{\xi_0, \ldots, \xi_n\}$ and if there is an overlap between the sets let $x_m, x_{m+1}, \cdots, x_n$ be the overlapping points. Take $\xi_0 = x_m$, $\xi_1 = x_{m+1}, \cdots, \xi_{n-m} = x_n$. Now we have $x_j \ne \xi_j$ and $\{x_0, x_1, \ldots, x_j, \xi_j, \xi_{j+1}, \cdots, \xi_n\}$ a set of $n + 2$ distinct infinitely close points for each $j$.

To show that $\delta^n f[x_0, \ldots, x_n] \approx \delta^n f[\xi_0, \cdots, \xi_n]$, we form a telescoping sum and apply identity (DiffQ):

$$\delta^n f[x_0, \ldots, x_n] - \delta^n f[\xi_0, \cdots, \xi_n]$$

$$= \sum_{j=0}^{n} \left(\delta^n f[x_0, \cdots, x_j, \xi_{j+1}, \cdots, \xi_n] - \delta^n f[x_0, \cdots, x_{j-1}, \xi_j, \cdots, \xi_n]\right)$$

$$= \sum_{j=0}^{n} (x_j - \xi_j)\,\delta^{n+1} f[x_0, \cdots, x_j, \xi_j, \cdots, \xi_n]$$

By hypothesis all the $n + 1$ order differences are near the same finite number so $\delta^n f[x] \approx \delta^n f[\xi]$ since $x_j \approx \xi_j$.

We can also use this identity to show that the $n$th order differences are finite. Since the identity holds for all infinitely close points, The Function Extension Axiom 2.1 shows that it must almost hold for sufficiently close differences (say within real $\theta$ for $x$'s within real $\eta$). A real difference is finite, so the nearby infinitesimal one is too.

Proof that Stable Differences Implies Smooth:

We need to show that the differences with repetition are S-continuous, not just defined as hyperreals. It follows from The Function Extension Axiom 2.1 (as in the proof of Theorem 3.4) that for sufficiently small infinitesimal $t$,

$$\delta^{n+1} f[x_0, x_1, \ldots, x_n, x_0 + t] \approx \delta^{n+1} f[x_0, \cdots, x_n, x_0]$$

If $\xi_0 \approx \xi_1 \approx \cdots \approx \xi_n \approx b \approx x_0 \approx \cdots \approx x_n$, then

$$\delta^{n+1} f[x_0, \ldots, x_n, x_0] \approx \delta^{n+1} f[x_0, \cdots, x_n, x_0 + t]$$

$$\approx \delta^{n+1} f[\xi_0, \ldots, \xi_n, \xi_0 + s]$$

$$\approx \delta^{n+1} f[\xi_0, \ldots, \xi_n, \xi_0]$$

So S-continuity of $\delta^{n+1} f$ implies S-continuity of $\delta^{n+1} f$ with one repetition. (This is the theorem on first order smoothness if $n = 0$.)

Our induction hypothesis is applied at $n$ to the function $g[x] = f'[x]$, that is, we assume that it is given that when $\delta^n g$ is S-continuous, $g$ is $n$-times continuously differentiable. Now apply $(\delta')$:

$$\delta^n f'[x_0, \ldots, x_n] = \sum_{i=0}^{n} \delta^{n+1} f[x_0, \cdots, x_n, x_i]$$

$$\approx \sum_{i=0}^{n} \delta^{n+1} f[\xi_0, \cdots, \xi_n, \xi_i]$$

$$= \delta^n f'[\xi_0, \ldots, \xi_n]$$

so $\delta^n f'$ is S-continuous and $f'$ is $n$-times continuously differentiable. This proves that $f$ is $n + 1$ times continuously differentiable as claimed.



Exercise set 8.5

1. Local Higher Order Fit

First, make a table of values to fit:

n = 1;
x = 0.0;
dx = 0.5;
f[x_] := Exp[x]
values = Table[{x + k dx, f[x + k dx]}, {k, -n, n}]

Next, make a list of basic functions:

Clear[dx];
polys = Table[dx^i, {i, 0, 2 n}]

Now Fit[] the data:

p = Fit[values, polys, dx];
p

And finally, Plot[] for comparison:

Plot[{Exp[x + dx], 1 + dx + dx^2/2, p}, {dx, -2, 2}]

Figure 8.3: 2nd Comparison

Take dx = 0.25, dx = 0.125 and compare the coefficients of your Fit[] p[dx] to the Taylor polynomial.

Extend your program to fit a polynomial of degree 4 and use the program to compare to the Taylor coefficients in the limit as $\delta x \to 0$.

Basic Theory of the Definite Integral

The second half of the Fundamental Theorem technically requires 
that we prove existence of the integral before we make the estimates 
given in the proof. 



We are assuming that we have a function f[x] and do not know whether or not there is 
an expression for its antiderivative. We want to take the following limit and be assured that 
it converges. 



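$$\lim_{\Delta x \downarrow 0} \left(f[a]\Delta x + f[a + \Delta x]\Delta x + f[a + 2\Delta x]\Delta x + f[a + 3\Delta x]\Delta x + \cdots + f[b - 2\Delta x]\Delta x + f[b - \Delta x]\Delta x\right)$$

If the limit exists it equals $\int_a^b f[x]\,dx$,

$$\int_a^b f[x]\,dx = \lim_{\Delta x \downarrow 0} \left(f[a]\Delta x + f[a + \Delta x]\Delta x + f[a + 2\Delta x]\Delta x + f[a + 3\Delta x]\Delta x + \cdots + f[b - 2\Delta x]\Delta x + f[b - \Delta x]\Delta x\right)$$

or

$$\int_a^b f[x]\,dx \approx f[a]\,\delta x + f[a + \delta x]\,\delta x + \cdots + f[b - \delta x]\,\delta x \qquad \text{for every } 0 < \delta x \approx 0$$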



When f[x] is continuous and [a, b] is a compact interval, of course this works as we prove 
next. When f[x] is not continuous or if we want to integrate over an interval like [l,oo), 
then the theory of integration is more complicated. Later sections in this chapter show you 
why. (Even the first half of the Fundamental Theorem does not work if we only have a 
pointwise differentiable antiderivative, because then f[x] can be discontinuous.) 

The proof of the first half of the Fundamental Theorem of Integral Calculus 5.1 does not 
require a separate proof of the convergence of the sum approximations to the integral. The 
fact that the limit converges in the case where we know an antiderivative F[x], where dF[x] = 
f[x] dx as an ordinary derivative, follows directly from the increment approximation. 

When we cannot find an antiderivative $F[x]$ for a given $f[x]$, we sometimes still want to work directly with the definition of the integral. A number of important functions are given by integrals of simple functions. The logarithm and arctangent are elementary examples. Some other functions like the probability function of the “bell-shaped curve” of probability and statistics do not have elementary formulas but do have integral formulas. The second half of the Fundamental Theorem justifies the integral formulas, which are useful as approximations. The NumIntAprx computer program shows us efficient ways to estimate the limit of sums directly.



9.1 Existence of the Integral

The next result states the approximation result we need without writing a symbol for $\int_a^b f[x]\,dx$. Once we show that $I$ exists, we are justified in writing $I = \int_a^b f[x]\,dx$. Continuity of $f[x]$ is needed to show that the limit actually “converges.”

Theorem 9.1. Existence of the Definite Integral

Let $f[x]$ be a continuous function on the interval $[a, b]$. Then there is a real number $I$ such that

$$\lim_{\Delta x \downarrow 0} \sum_{\substack{x = a \\ \text{step } \Delta x}}^{b - \Delta x} [f[x]\,\Delta x] = I$$

or, equivalently, for any $0 < \delta x \approx 0$ the natural extension of the sum function satisfies

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \approx I$$


Proof:

First, by the Extreme Value Theorem 4.4, $f[x]$ has a min, $m$, and a Max, $M$, on the interval $[a, b]$. Monotony of summation tells us

$$m \times (b - a) \le \sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \le M \times (b - a)$$

So that $\sum_{x = a,\ \text{step } \delta x}^{b - \delta x} [f[x]\,\delta x]$ is a finite number and thus near some real value $I[\delta x] \approx \sum_{x = a,\ \text{step } \delta x}^{b - \delta x} [f[x]\,\delta x]$. What we need to show is that if we choose a different infinitesimal, say $\delta u$, then $I[\delta x] = I[\delta u]$ or

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \approx \sum_{\substack{u = a \\ \text{step } \delta u}}^{b - \delta u} [f[u]\,\delta u]$$

In other words, we need to compare “rectangular” approximations to the “area” with different step sizes. These step sizes may not be multiples of one another. This creates a messy technical complication as illustrated next with several finite step sizes. (You can experiment further using the program GraphIntApprox.)
The next two graphs show a function over an interval with 12 and 13 subdivisions. If we superimpose these two subdivisions, we see

Figure 9.1: Both 12 and 13 Subdivisions Together

Notice that the overlaps between the various rectangles are not equal sizes. Here is another example with 17 and 15 subdivisions:

Figure 9.2: Both 17 and 15 Subdivisions Together

Again, the overlapping portions of the rectangles are unequal in size.



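As a first step, we will make upper and lower estimates for the sum with a fixed step size. We know by the Extreme Value Theorem 4.4 that for each $\Delta x$ and each $x = a + k\,\Delta x < b$, the function $f[x]$ has a max and a min on the small subinterval $[x, x + \Delta x]$,

$$f[x_m] \le f[\xi] \le f[x_M], \qquad \text{for any } \xi \text{ satisfying } x \le \xi \le x + \Delta x$$

We may ‘code’ this fact by letting $x_m = x_m[x, \Delta x]$ and $x_M = x_M[x, \Delta x]$ be functions that give these values. We extend and sum these to see that

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x_m]\,\delta x] \le \sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x]$$

and

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \le \sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x_M]\,\delta x]$$

When $\delta x \approx 0$ is infinitesimal, we also know that $f[x_m] \approx f[x_M]$ and in fact that the largest difference along the partition is also infinitesimal,

$$0 \le \mathrm{Max}\left[(f[x_M[x, \delta x]] - f[x_m[x, \delta x]]) : x = a, a + \delta x, a + 2\,\delta x, \cdots < b\right] = \theta \approx 0$$

This shows that our upper and lower estimates are close,

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x_M]\,\delta x] - \sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x_m]\,\delta x] = \sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [(f[x_M] - f[x_m])\,\delta x] \le \theta\cdot\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [\delta x] = \theta\cdot(b - a) \approx 0$$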
We have shown that for any equal size infinitesimal partition, the upper and lower estimate sums are infinitely close. Now consider the unequal partition given by overlapping a $\delta x$ partition and a $\delta u$ partition for two different infinitesimals. A lower estimate with the min of $f[x]$ over the refined partition is LARGER than the min estimate of either equal size partition, because the subinterval mins are taken over smaller subintervals. An upper estimate with the max of $f[x]$ over the refined partition is SMALLER than the max estimate over either equal size partition. The difference between the refined upper and lower estimates is infinitesimal by the same kind of computation as above. We therefore have

$I[\delta x] \approx$ Min $\delta x$-Sum $\approx$ Max $\delta x$-Sum and Min $\delta u$-Sum $\approx$ Max $\delta u$-Sum $\approx I[\delta u]$

Min $\delta x$-Sum $\le$ Refined Min Sum $\le$ Refined Max Sum $\le$ Max $\delta x$-Sum
Min $\delta u$-Sum $\le$ Refined Min Sum $\le$ Refined Max Sum $\le$ Max $\delta u$-Sum

so $I[\delta x] \approx I[\delta u]$ and since these are real numbers $I[\delta x] = I[\delta u]$. This proves that the integral of a continuous function over a compact interval exists.
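
The agreement of sums with unrelated step sizes can be watched numerically. The Python sketch below is our own illustration, not the course's NumIntAprx or graphing programs: the helper name `left_riemann_sum` and the test integrand (cosine on $[0, \pi/2]$, whose integral is 1) are assumptions. It compares the sums for 12 and 13 subdivisions and then for 1200 and 1300.

```python
import math

def left_riemann_sum(f, a, b, subdivisions):
    # f[a] dx + f[a + dx] dx + ... + f[b - dx] dx with dx = (b - a) / subdivisions.
    dx = (b - a) / subdivisions
    return sum(f(a + k * dx) * dx for k in range(subdivisions))

a, b = 0.0, math.pi / 2
for coarse, fine in ((12, 13), (1200, 1300)):
    s1 = left_riemann_sum(math.cos, a, b, coarse)
    s2 = left_riemann_sum(math.cos, a, b, fine)
    print(f"{coarse} pieces: {s1:.6f}   {fine} pieces: {s2:.6f}   difference: {abs(s1 - s2):.2e}")
```

Neither step size is a multiple of the other, yet the two sums draw together as both steps shrink, which is the finite shadow of $I[\delta x] = I[\delta u]$.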



Exercise set 9.1

1. Keisler’s Proof of Existence

Let $f[x]$ be continuous on $[a, b]$. We want to show that the integral exists. This is equivalent to showing that for every two positive infinitesimals $\delta x$ and $\delta u$ we have

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \approx \sum_{\substack{u = a \\ \text{step } \delta u}}^{b - \delta u} [f[u]\,\delta u]$$

(a) First, show that we can reduce the problem to the case where $f[x] \ge 0$ on $[a, b]$. (HINT: Consider $f[x] = F[x] - m$ where $m = \mathrm{Min}[F[x] \mid a \le x \le b]$.)

(b) Given two positive infinitesimals $\delta x$ and $\delta u$, show that we can reduce the problem to showing that for any positive real number $r$ we have

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \le r + \sum_{\substack{u = a \\ \text{step } \delta u}}^{b - \delta u} [f[u]\,\delta u]$$

(c) Let $r$ be a positive real number and take $c = r/(b - a)$. Below you will show that

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] \le \sum_{\substack{u = a \\ \text{step } \delta u}}^{b - \delta u} [(f[u] + c)\,\delta u]$$

Why does this establish (b)?

(d) To prove that the inequality in (c) holds, suppose to the contrary that

$$\sum_{\substack{x = a \\ \text{step } \delta x}}^{b - \delta x} [f[x]\,\delta x] > \sum_{\substack{u = a \\ \text{step } \delta u}}^{b - \delta u} [(f[u] + c)\,\delta u]$$

Show that then there must be a pair of points $x$ and $u$ in $[a, b]$ so that $x - \delta u < u < x + \delta x$ and $f[x] > f[u] + c$.

(e) When $f[x]$ is continuous and $\delta x \approx 0$, we cannot have $x - \delta u < u < x + \delta x$ and $f[x] > c + f[u]$. Why?

(f) Why does the contradiction of (e) prove that the integral exists?

(HINTS: Step (d) is the hard one. The sums are areas of regions bounded by $x = a$, $x = b$, the $x$-axis, and step functions $s_{\delta x}[x]$ and $S_{\delta u}[x]$. When the $\delta x$-sum is larger than the $\delta u$-sum, the region below $s_{\delta x}[x]$ cannot lie completely inside the region below $S_{\delta u}[x]$. Let $v$ satisfy $s_{\delta x}[v] > S_{\delta u}[v]$. The value $s_{\delta x}[v] = f[x]$ for a $\delta x$-partition point, $x \le v < x + \delta x$ and the value $S_{\delta u}[v] = f[u] + c$ for a $\delta u$-partition point, $u \le v < u + \delta u$. Show that $x - \delta u < u < x + \delta x$.

When $\Delta x$ and $\Delta u$ are real, concoct functions that give the needed values of $x = X[\Delta x]$ and $u = U[\Delta u]$, then apply the implication when $\delta x$ and $\delta u$ are infinitesimal.)



9.2 You Can’t Always Integrate Discontinuous Functions



With discontinuous integrands, it is possible to make the sums oscillate as
$\Delta x \to 0$. In these cases, numerical integration by computer will likely give
the wrong answer.



Exercises 12.7.3 and 12.7.4 in the main text show you examples of false attempts to
integrate discontinuous functions by antidifferentiation. The idea in those exercises is very
important - it is a main text topic. You cannot use the Fundamental Theorem without
verifying the hypotheses.

More bizarre examples are possible if you permit the use of weakly approximating pointwise
derivatives. Examples 6.3.1 and 6.3 show how a single oscillatory discontinuity in a
pointwise derivative can lead to unexpected results. It is possible to tie a bunch of these
oscillations together in such a way that the resulting function has oscillatory discontinuities
on ‘a set of positive measure.’ In this case we have the pointwise derivative $D_x F[x] = f[x]$,
but the sums trying to define the integral do not converge to F[b] - F[a].

There are two different kinds of discontinuity preventing convergence of the approximat- 
ing sums for integrals. Isolated infinite discontinuities like the ones cited above from the 
main text are easiest to understand and we discuss them below in a section on “improper” 
integrals. There is also a project on improper integrals. 

Oscillatory (even bounded) discontinuities are much more difficult to understand. B. 
Riemann discovered the best condition that would allow convergence of the sums. The 
integral is often called the “Riemann integral” as a result. This is peculiar, because the 
notation for integrals originated in 1675 in a paper of Leibniz, while Riemann’s integrability 
result appears over 150 years later in his paper on Fourier series. It took a very long time for 
mathematicians to understand integration of discontinuous functions. (You too can progress 
very far by only integrating continuous functions.) 
Riemann was interested in passing a limit under the integral,

\[ \lim_{n\to\infty} \int_a^b f_n[x]\,dx = \int_a^b \lim_{n\to\infty} f_n[x]\,dx \]

In his particular case the functions fn[x] were Fourier series approximations. This idea 
seems harmless enough, but the appearance is deceiving. If we can do this once, we should 
be able to do it twice. We want you to see just how awful limits of functions can be. 

Example 9.1. A Very Discontinuous Limit 

The following limit exists for each x, but the limit function is discontinuous at every
point.

\[ \lim_{n\to\infty} \lim_{m\to\infty} (\mathrm{Cos}[\pi\, n!\, x])^m = I_Q[x] \]

If x = p/q is rational, then when n > q, $\mathrm{Cos}[\pi\, n!\, \frac{p}{q}] = 1$, so

\[ \lim_{m\to\infty} \left( \mathrm{Cos}[\pi\, n!\, \tfrac{p}{q}] \right)^m = 1 \]

and

\[ \lim_{n\to\infty} 1 = \lim_{n\to\infty} \lim_{m\to\infty} \left( \mathrm{Cos}[\pi\, n!\, \tfrac{p}{q}] \right)^m = 1 \]

If x is not rational, then no matter how large we take a fixed n,

\[ \lim_{m\to\infty} (\mathrm{Cos}[\pi\, n!\, x])^m = 0 \]

since the fixed quantity $|\mathrm{Cos}[\pi\, n!\, x]| < 1$. The limit of zero is zero, so when x is fixed and
irrational,

\[ \lim_{n\to\infty} 0 = \lim_{n\to\infty} \lim_{m\to\infty} (\mathrm{Cos}[\pi\, n!\, x])^m = 0 \]

Together, these two parts show that the limit exists and equals the indicator function of
the rational numbers,

\[ I_Q[x] = \begin{cases} 1, & \text{if } x \text{ is rational} \\ 0, & \text{if } x \text{ is irrational} \end{cases} \]

This function is discontinuous at every point since every x has both rational and irrational
points arbitrarily nearby. In other words, we can approximate $\xi \approx x \approx \eta$ with $I_Q[\xi] = 1$
and $I_Q[\eta] = 0$, so

\[ \xi \approx \eta \quad\text{but}\quad I_Q[\xi] = 1 \text{ and } I_Q[\eta] = 0 \]

It is even difficult to make a useful plot of low approximations to the limit. 




Figure 9.3: (Cos[7t3! x])® 
We can show that

\[ \lim_{n\to\infty} \lim_{m\to\infty} \int_{-\pi}^{\pi} (\mathrm{Cos}[\pi\, n!\, x])^m \, dx = 0 \]

and the limits defining the ordinary integral

\[ \int_{-\pi}^{\pi} I_Q[x]\,dx = \int_{-\pi}^{\pi} \Bigl( \lim_{n\to\infty} \lim_{m\to\infty} \mathrm{Cos}^m[\pi\, n!\, x] \Bigr)\, dx \]

do not converge.

You cannot interchange these limits and the ordinary integral.
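
To see concretely why the sums for the indicator function of the rationals cannot settle down, here is a small sketch. It assumes the interval [0, 1] instead of [-π, π] so that all bookkeeping stays exact, and it represents a sample point as r + c·√2 with rational r and c (so the point is rational exactly when c = 0). The names `indicator_q` and `tagged_sum` are hypothetical helpers, not anything from the main text.

```python
from fractions import Fraction

def indicator_q(rational_part, sqrt2_coefficient):
    """I_Q at the point rational_part + sqrt2_coefficient * sqrt(2).

    With both arguments rational, the point is rational exactly when
    the coefficient of sqrt(2) is zero.
    """
    return 1 if sqrt2_coefficient == 0 else 0

def tagged_sum(n, irrational_tags):
    """Riemann sum of I_Q on [0, 1] with n equal subintervals."""
    dx = Fraction(1, n)
    total = Fraction(0)
    for k in range(n):
        # Rational tag: the left endpoint k/n.
        # Irrational tag: k/n + sqrt(2)/(2n), which stays inside the
        # subinterval because sqrt(2)/2 < 1.
        coefficient = Fraction(1, 2 * n) if irrational_tags else Fraction(0)
        total += indicator_q(k * dx, coefficient) * dx
    return total

for n in (10, 100, 1000):
    print(n, tagged_sum(n, False), tagged_sum(n, True))
```

With rational sample points every sum equals the length of the interval, and with irrational sample points every sum is zero, no matter how fine the partition, so there is no single limit for the sums to approach.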

After Riemann, the study of Fourier series motivated Lebesgue’s work on integration that 
ultimately led to a more powerful kind of integral now called the Lebesgue integral. Lebesgue 
integrals of continuous functions are the same as the integrals we have been studying, but 
Lebesgue integrals are defined for more discontinuous functions and satisfy more general 
and flexible interchange of limit and integral theorems. When you really need to integrate 
wild discontinuities, study Lebesgue integrals. 



9.3 Fundamental Theorem: Part 2 



The second part of the Fundamental Theorem of Integral Calculus says that
the derivative of an integral of a continuous function is the integrand,

\[ \frac{d}{dX} \int_a^X f[x]\,dx = f[X] \]



The function $A[X] = \int_a^X f[x]\,dx$ can be thought of as the “accumulated area” under the
curve y = f[x] from a to X shown in Figure 9.4. The “accumulation function” can also be
thought of as the reading of your odometer at time X for a given speed function f[x].




Figure 9.4: $A[X] = \int_a^X f[x]\,dx$
Theorem 9.2. Second Half of the Fundamental Theorem

Suppose that f[x] is a continuous function on an interval containing a and we
define a new function by “accumulation,”

\[ A[X] = \int_a^X f[x]\,dx \]

Then A[X] is smooth and $\frac{dA}{dX}[X] = f[X]$; in other words,

\[ \frac{d}{dX} \int_a^X f[x]\,dx = f[X] \]



Proof:

We show that A[X] satisfies the differential approximation
$A[X + \delta X] - A[X] = f[X]\,\delta X + \varepsilon \cdot \delta X$ with $\varepsilon \approx 0$ when $\delta X \approx 0$.
This proves that f[X] is the derivative of A[X].




Figure 9.5: $\int_X^{X+\Delta X} f[x]\,dx$



By definition of A and the additivity property of integrals, for every real $\Delta X$ (small
enough so $X + \Delta X$ is still in the interval where f[x] is continuous) we have

\[
\begin{aligned}
A[X + \Delta X] &= \int_a^{X+\Delta X} f[x]\,dx = \int_a^{X} f[x]\,dx + \int_X^{X+\Delta X} f[x]\,dx \\
 &= A[X] + \int_X^{X+\Delta X} f[x]\,dx \\
A[X + \Delta X] - A[X] &= \int_X^{X+\Delta X} f[x]\,dx
\end{aligned}
\]

The Extreme Value Theorem for Continuous Functions 4.4 says that f[x] has a max and
a min on the interval $[X, X + \Delta X]$, $m = f[X_m] \le f[x] \le M = f[X_M]$ for all
$X \le x \le X + \Delta X$. Monotony of the integral and simple algebra gives us the estimates

\[
\begin{aligned}
m \cdot \Delta X = \int_X^{X+\Delta X} m\,dx &\le \int_X^{X+\Delta X} f[x]\,dx \le \int_X^{X+\Delta X} M\,dx = M \cdot \Delta X \\
m \cdot \Delta X &\le A[X + \Delta X] - A[X] \le M \cdot \Delta X \\
m &\le \frac{A[X + \Delta X] - A[X]}{\Delta X} \le M \\
m = f[X_m] &\le f[X] \le f[X_M] = M \\
\left| \frac{A[X + \Delta X] - A[X]}{\Delta X} - f[X] \right| &\le f[X_M] - f[X_m]
\end{aligned}
\]

with both $X_m$ and $X_M$ in the interval $[X, X + \Delta X]$.




Figure 9.7: Upper and lower estimates 

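The Function Extension Axiom 2.1 means that this inequality also holds for positive
infinitesimal $\delta X = \Delta X$,

\[ \left| \frac{A[X + \delta X] - A[X]}{\delta X} - f[X] \right| \le f[X_M] - f[X_m] \]

We know that $X_m \approx X_M$ when $\delta X \approx 0$. Continuity of f[x] means that
$f[X_m] \approx f[X_M] \approx f[X]$ in this case, so

\[ \frac{A[X + \delta X] - A[X]}{\delta X} = f[X] + \varepsilon \]

with $\varepsilon \approx 0$. So,

\[ A[X + \delta X] - A[X] = f[X] \cdot \delta X + \varepsilon \cdot \delta X \]

with $\varepsilon \approx 0$, when $\delta X \approx 0$. This proves the theorem because we have verified the microscope
equation from Definition 5.2

\[ A[X + \delta X] = A[X] + A'[X]\,\delta X + \varepsilon \cdot \delta X \]

with A'[X] = f[X].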
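
A numerical spot check of the theorem is easy to set up. The sketch below assumes the integrand Cos[x] with a = 0, approximates the accumulation function by a midpoint rule (our own choice; any reasonable quadrature would do), and compares the difference quotient of A with the integrand at X = 1. The helper names `accumulate` and `difference_quotient` are hypothetical.

```python
import math

def accumulate(f, a, X, n=20000):
    """Midpoint-rule approximation to A[X] = integral of f from a to X."""
    dx = (X - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx

def difference_quotient(f, a, X, dX):
    """(A[X + dX] - A[X]) / dX for the accumulation function of f."""
    return (accumulate(f, a, X + dX) - accumulate(f, a, X)) / dX

X = 1.0
for dX in (0.1, 0.01, 0.001):
    q = difference_quotient(math.cos, 0.0, X, dX)
    print(dX, q, q - math.cos(X))
```

The last column plays the role of the error ε in the microscope equation: it shrinks along with the increment.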



Exercise set 9.3

1. A Formula for ArcTangent

Prove that

\[ \mathrm{ArcTan}[x] = \int_0^x \frac{1}{1+\xi^2}\,d\xi \]

(HINT: See Example 5A.)








9.4 Improper Integrals 



The ordinary integral does not cover the case where the integrand function 
is discontinuous or the case where the interval of integration is unbounded. 
This section extends the integral to these cases. 



There are two main kinds of improper integrals that we can study simply. One kind has
an isolated discontinuity in the integrand like

\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx \]



and 




dx 



The other kind of improper integral is over an unbounded interval like the probability 
distribution 

2 r -X- ^ r ^ a 

— ;= / e dx or — dx 

J-oo Jl X 

In both of these cases, we can calculate $\int_a^c f[x]\,dx$ and take a limit as $c \to b$. The theory of
these kinds of limiting integrals is similar to the theory of infinite series. We begin with a
very basic pair of examples.

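Example 9.2. $\int_0^1 1/x^p\,dx$

The function $1/x^p$ has a discontinuity at x = 0 when p > 0, but we can compute

\[ \lim_{b \downarrow 0} \int_b^1 \frac{1}{x^p}\,dx \]

The Fundamental Theorem applies to the continuous function $1/x^p$ for $0 < b \le x \le 1$, so

\[
\int_b^1 \frac{1}{x^p}\,dx = \int_b^1 x^{-p}\,dx
= \begin{cases} \dfrac{1}{1-p}\, x^{1-p} \Big|_b^1, & \text{if } p \ne 1 \\[2mm] \mathrm{Log}[x] \Big|_b^1, & \text{if } p = 1 \end{cases}
= \begin{cases} \dfrac{1 - b^{1-p}}{1-p}, & \text{if } p < 1 \\[2mm] -\mathrm{Log}[b], & \text{if } p = 1 \\[2mm] \dfrac{1/b^{p-1} - 1}{p-1}, & \text{if } p > 1 \end{cases}
\]

The limits in these cases are

\[
\lim_{b \downarrow 0} \int_b^1 \frac{1}{x^p}\,dx
= \begin{cases}
\lim_{b \downarrow 0} \dfrac{1 - b^{1-p}}{1-p} = \dfrac{1}{1-p}, & \text{if } p < 1 \\[2mm]
\lim_{b \downarrow 0} -\mathrm{Log}[b] = \lim_{b \downarrow 0} \mathrm{Log}[1/b] = \lim_{c \to \infty} \mathrm{Log}[c] = \infty, & \text{if } p = 1 \\[2mm]
\lim_{b \downarrow 0} \dfrac{1/b^{p-1} - 1}{p-1} = \infty, & \text{if } p > 1
\end{cases}
\]

To summarize, we have

\[
\int_0^1 \frac{1}{x^p}\,dx = \begin{cases} \dfrac{1}{1-p}, & \text{if } p < 1 \\[2mm] \infty, & \text{if } p \ge 1 \end{cases}
\]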

Now we consider the other difficulty, an unbounded interval of integration. 



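Example 9.3. $\int_1^\infty 1/x^p\,dx$

The integrand $1/x^p$ is continuous on the interval $[1, \infty)$, so we can compute

\[ \lim_{b \to \infty} \int_1^b \frac{1}{x^p}\,dx \]

Again we have cases,

\[
\int_1^b \frac{1}{x^p}\,dx = \int_1^b x^{-p}\,dx
= \begin{cases} \dfrac{1}{1-p}\, x^{1-p} \Big|_1^b, & \text{if } p \ne 1 \\[2mm] \mathrm{Log}[x] \Big|_1^b, & \text{if } p = 1 \end{cases}
= \begin{cases} \dfrac{b^{1-p} - 1}{1-p}, & \text{if } p < 1 \\[2mm] \mathrm{Log}[b], & \text{if } p = 1 \\[2mm] \dfrac{1 - 1/b^{p-1}}{p-1}, & \text{if } p > 1 \end{cases}
\]

The limits in these cases are

\[
\lim_{b \to \infty} \int_1^b \frac{1}{x^p}\,dx
= \begin{cases}
\lim_{b \to \infty} \dfrac{b^{1-p} - 1}{1-p} = \infty, & \text{if } p < 1 \\[2mm]
\lim_{b \to \infty} \mathrm{Log}[b] = \infty, & \text{if } p = 1 \\[2mm]
\lim_{b \to \infty} \dfrac{1 - 1/b^{p-1}}{p-1} = \dfrac{1}{p-1}, & \text{if } p > 1
\end{cases}
\]

To summarize, we have the opposite cases of convergence in the infinite interval case that
we had in the (0, 1] case above,

\[
\int_1^\infty x^{-p}\,dx = \begin{cases} \dfrac{1}{p-1}, & \text{if } p > 1 \\[2mm] \infty, & \text{if } p \le 1 \end{cases}
\]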
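
The "opposite cases" can be tabulated directly from the antiderivative. In this sketch the function name `power_integral` and the cutoffs 1e-8 and 1e8, which stand in for the limits b ↓ 0 and b → ∞, are assumptions chosen for illustration.

```python
import math

def power_integral(p, lower, upper):
    """Integral of x**(-p) over [lower, upper], 0 < lower < upper,
    computed from the antiderivative."""
    if p == 1:
        return math.log(upper) - math.log(lower)
    return (upper ** (1.0 - p) - lower ** (1.0 - p)) / (1.0 - p)

for p in (0.5, 1.0, 2.0):
    near_zero = power_integral(p, 1e-8, 1.0)   # stands in for the (0, 1] case
    far_out = power_integral(p, 1.0, 1e8)      # stands in for the [1, oo) case
    print(p, near_zero, far_out)
```

For p = 0.5 only the first column stays bounded, for p = 2 only the second does, and for p = 1 both columns grow like a logarithm.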



Infinite intervals of integration arise often in probability. One such place is in exponential 
waiting times. This can be explained intuitively as follows. Suppose you have light bulbs 
that fail in some random way, but if one of them is still working, it is as good as new, that 
is, the time you expect to wait for it to fail is the same as if it was new. If we write P[t] for 
the probability of the bulb lasting at least until time t, then the ‘good as new’ statement 
becomes 



\[ \text{Probability of lasting to } t + \Delta t \text{ given that it lasted to } t = \frac{P[t + \Delta t]}{P[t]} = P[\Delta t] \]

If we re-write this algebraically, this probabilistic statement is the same as the exponential
functional identity (see Chapter 2 above for the background on functional identities),

\[ P[t + \Delta t] = P[t] \times P[\Delta t] \]

If we assume that P[t] is a smooth function, we saw in Example 2.7 above that

\[ P[t] = e^{-\lambda t} \]

for some $\lambda > 0$. Example 2.7 shows that $P[t] = e^{k t}$ and the reason that we have a
negative constant $k = -\lambda$ in our exponential is because the light bulb cannot last forever,

\[ \lim_{t \to \infty} P[t] = \lim_{t \to \infty} e^{-\lambda t} = 0. \]

Notice that P[0] = 1 says we start with a good bulb.

We can think of the expression $\lambda\, e^{-\lambda t}\,dt$ as the probability that the bulb burns out
during the time interval [t, t + dt). (See the next exercise.)

Example 9.4. Expected Life of the Bulb 

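The average or expected lifetime of a bulb that fails by an exponential waiting time is
given by the improper integral

\[ \int_0^\infty t\, \lambda\, e^{-\lambda t}\,dt \]

This can be computed with integration by parts,

\[
\begin{aligned}
u &= t & dv &= \lambda\, e^{-\lambda t}\,dt \\
du &= dt & v &= -e^{-\lambda t}
\end{aligned}
\]

so

\[
\begin{aligned}
\int_0^b t\, \lambda\, e^{-\lambda t}\,dt &= -t\, e^{-\lambda t} \Big|_0^b + \int_0^b e^{-\lambda t}\,dt \\
 &= -b\, e^{-\lambda b} - \frac{1}{\lambda}\, e^{-\lambda t} \Big|_0^b \\
 &= \frac{1}{\lambda} - b\, e^{-\lambda b} - \frac{1}{\lambda}\, e^{-\lambda b}
\end{aligned}
\]

We know (from Chapter 7 of the main text) that exponentials beat powers to infinity, so

\[ \lim_{b \to \infty} \left( b\, e^{-\lambda b} + \frac{1}{\lambda}\, e^{-\lambda b} \right) = 0 \]

We have shown that the expected life of the bulb is

\[ \int_0^\infty t\, \lambda\, e^{-\lambda t}\,dt = \frac{1}{\lambda} \]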
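
As a check on the integration by parts, the partial integral over [0, b] can be compared with an independent numerical approximation and followed as b grows. This is a sketch under our own assumptions: the rate λ = 0.5 is an arbitrary illustrative value, and `partial_expected_life` and `midpoint_check` are hypothetical helper names.

```python
import math

def partial_expected_life(lam, b):
    """Integral of t*lam*exp(-lam*t) over [0, b], from integration by parts."""
    return 1.0 / lam - b * math.exp(-lam * b) - math.exp(-lam * b) / lam

def midpoint_check(lam, b, n=100000):
    """Independent midpoint-rule approximation of the same integral."""
    dt = b / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += t * lam * math.exp(-lam * t)
    return total * dt

lam = 0.5   # hypothetical failure rate, chosen only for illustration
for b in (1.0, 10.0, 50.0):
    print(b, partial_expected_life(lam, b), midpoint_check(lam, b))
```

Both columns agree, and both approach 1/λ = 2 as the upper limit increases.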



9.4.1 Comparison of Improper Integrals 

The most important integral in probability is the Gaussian or “normal” probability related
to the integral

\[ \mathrm{Erf}[X] = \frac{2}{\sqrt{\pi}} \int_0^X e^{-x^2}\,dx \]

You saw in the NoteBook Symbolicintegr that this does not have an antiderivative that
can be expressed in terms of elementary functions. Mathematica calculates this integral in
terms of “Erf[ ],” its built in Gaussian error function. We often want to be certain that
an integral converges, but without calculating the limit explicitly (since this is sometimes
impossible). We may do this by comparing an integral to a known one. This is similar to
the comparison tests for convergence of series in the main text Chapter 18 on Series.

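In the case of the integral above, $e^{-x^2} \le e^{-|x|}$ for $|x| \ge 1$, so

\[ \int_1^\infty e^{-x^2}\,dx \le \int_1^\infty e^{-x}\,dx \]

and

\[ \int_{-\infty}^{-1} e^{-x^2}\,dx \le \int_{-\infty}^{-1} e^{x}\,dx \]

The estimating integral converges,

\[ \int_1^\infty e^{-x}\,dx = \lim_{b \to \infty} \int_1^b e^{-x}\,dx = \lim_{b \to \infty} (e^{-1} - e^{-b}) = e^{-1} \]

so the tails of the $e^{-x^2}$ integral converge and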



r*-l 
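
The comparison of the Gaussian tail with an exponential tail can be spot-checked with the error function from Python's standard library, which uses the same normalization 2/√π as the Erf above. The helper names `gaussian_tail` and `exponential_tail` are our own, and the sketch is only a numerical illustration, not a proof of convergence.

```python
import math

def gaussian_tail(X):
    """Integral of exp(-x**2) over [X, oo), written with the error function."""
    return 0.5 * math.sqrt(math.pi) * (1.0 - math.erf(X))

def exponential_tail(X):
    """Integral of exp(-x) over [X, oo)."""
    return math.exp(-X)

for X in (1.0, 2.0, 3.0):
    print(X, gaussian_tail(X), exponential_tail(X))
```

For every cutoff X ≥ 1 the Gaussian tail is the smaller of the two numbers, as the comparison predicts.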



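Exercise set 9.4

1. Improper Drill

(1) $\int_0^1 1/\sqrt{x}\;dx =$
(2) $\int_1^\infty 1/\sqrt{x}\;dx =$
(3) $\int_0^\infty 1/\sqrt{x}\;dx =$
(4) $\int_0^1 1/x^2\;dx =$
(5) $\int_1^\infty 1/x^2\;dx =$
(6) $\int_0^\infty 1/x^2\;dx =$
(7) $\int_0^1 1/\sqrt[3]{x}\;dx =$
(8) $\int_1^\infty 1/\sqrt[3]{x}\;dx =$
(9) $\int_0^\infty 1/\sqrt[3]{x}\;dx =$
(10) $\int_0^1 1/x^3\;dx =$
(11) $\int_1^\infty 1/x^3\;dx =$
(12) $\int_0^\infty 1/x^3\;dx =$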

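2. Show that

\[ P[t] = \int_t^\infty \lambda\, e^{-\lambda s}\,ds = \lim_{b \to \infty} \int_t^b \lambda\, e^{-\lambda s}\,ds = e^{-\lambda t} \]

and

\[ \int_0^t \lambda\, e^{-\lambda s}\,ds = 1 - e^{-\lambda t} \]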



3 . Calculate the integral 



X e 



dx 



symbolically using a change of variables. Explain your answer geometrically. 
Calculate the integral 

2 i 



e-" 



dx 



/7T 



4 . Prove that the integral 



Sin[cc] 



dx 



converges. (Is Sin[x]/x really discontinuous at zero? Consider its power series.)
Do the integrals

r r Wf!! d, 

Jo X Jo X 

converge? This is a tough question, but perhaps you can at least use Mathematica to 
make conjectures. 

The previous exercise is related to conditionally convergent series studied in the Mathe- 
matical Background Chapter on Series below. Lebesgue integrals are always absolutely 
convergent, but we can have conditionally convergent improper integrals when we define 
them by limits like 

Jo ^ 

5. The Gamma Function 

The Gamma function is given by the improper integral

\[ \Gamma(s) = \int_0^\infty t^{s-1}\, e^{-t}\,dt \]

This integral has both kinds of improperness. Why? Show that it converges anyway by
breaking up the two cases

\[ \int_0^\infty t^{s-1}\, e^{-t}\,dt = \int_0^1 t^{s-1}\, e^{-t}\,dt + \int_1^\infty t^{s-1}\, e^{-t}\,dt \]

Use Integration by Parts to show $\Gamma(s+1) = s\,\Gamma(s)$.

Use s = n, a positive integer and induction on the functional identity above to show
that $\Gamma(s+1)$ is an extension of the factorial function,

\[ \Gamma(n+1) = n! \]
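
Before proving these identities it can be reassuring to spot-check them numerically. The sketch below relies on the Gamma function in Python's standard `math` module rather than on the improper integral itself, and the non-integer test value s = 2.5 is an arbitrary choice of ours.

```python
import math

# Gamma(n + 1) against n! for the first few positive integers.
for n in range(1, 8):
    print(n, math.gamma(n + 1), math.factorial(n))

# The functional identity Gamma(s + 1) = s * Gamma(s) at a non-integer point.
s = 2.5   # an arbitrary non-integer test value
print(math.gamma(s + 1), s * math.gamma(s))
```

A check like this is no substitute for the induction argument, but it does catch an off-by-one slip such as confusing Γ(n) with n!.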

9.4.2 A Finite Funnel with Infinite Area? 

Suppose we imagine an infinite funnel obtained by rotating the curve y = 1/x about the
x-axis. We can compute the volume of the funnel by slicing it into disks,




Figure 9.8: y = 1/x Rotated

\[ \int_1^\infty \pi\, r^2(x)\,dx = \pi \int_1^\infty x^{-2}\,dx \]

6. Finite Volume

Calculate the integral above and show that the volume is finite ($\pi$).

Paradoxically, the surface area of this infinite funnel is infinite. If you review the
calculation of surface area integrals from Chapter 12 of the main text, you will obtain the
formula

\[ \text{Area} = 2\pi \int_1^\infty \frac{1}{x} \sqrt{1 + \frac{1}{x^4}}\,dx \]

7. Infinite Area

Show that the surface area integral above is infinite by comparing it to a smaller integral
that you know diverges.

Perhaps it is a good thing that we can’t build infinite objects. We would run out of paint 
to cover them, even though we could fill them with paint... 
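
The contrast between the two integrals shows up quickly if we cut the funnel off at x = b and let b grow. This sketch assumes the standard surface-of-revolution formula with the factor 2π and uses the comparison integrand 1/x, which is smaller than the area integrand because the square root is at least 1; the helper names `volume_to` and `area_lower_bound_to` are hypothetical.

```python
import math

def volume_to(b):
    """pi * integral of x**(-2) over [1, b]: volume of the funnel out to x = b."""
    return math.pi * (1.0 - 1.0 / b)

def area_lower_bound_to(b):
    """2*pi * integral of 1/x over [1, b], a lower bound for the surface
    area out to x = b because sqrt(1 + 1/x**4) >= 1."""
    return 2.0 * math.pi * math.log(b)

for b in (1e1, 1e3, 1e6, 1e9):
    print(b, volume_to(b), area_lower_bound_to(b))
```

The volume column levels off at π while the lower bound for the area keeps growing like a logarithm, slowly but without bound.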
Derivatives of Multivariable Functions



Functions of several variables whose partial derivatives can be computed
by rules automatically are differentiable when the function
and its partial derivative formulas are defined on a rectangle.



Theorem 10.1. Defined Formulas Imply Approximation 

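Suppose that z = f[x,y] is given by formulas and that the partial derivatives
$\frac{\partial f}{\partial x}[x,y]$ and $\frac{\partial f}{\partial y}[x,y]$ can be computed using the rules of Chapter 6 (Specific
Functions, Superposition Rule, Product Rule, Chain Rule) holding one variable at a
time fixed. If the resulting three formulas f[x,y], $\frac{\partial f}{\partial x}[x,y]$, $\frac{\partial f}{\partial y}[x,y]$ are all defined
in a compact box, $\alpha \le x \le \beta$, $\gamma \le y \le \eta$, then

\[
f[x + \delta x, y + \delta y] - f[x,y] = \frac{\partial f}{\partial x}[x,y] \cdot \delta x + \frac{\partial f}{\partial y}[x,y] \cdot \delta y + \varepsilon \cdot \sqrt{\delta x^2 + \delta y^2}
\]

with $\varepsilon$ uniformly small in the (x,y)-box for sufficiently small $\delta x$ and $\delta y$.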
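
The increment formula can be solved for the error factor and evaluated for shrinking real increments. This is a sketch only: the formula f[x,y] = x²y + Sin[y], the base point (1, 2), and the helper name `epsilon` are assumptions chosen for illustration, with the two partial derivatives written out by the usual rules.

```python
import math

def f(x, y):
    return x * x * y + math.sin(y)

def fx(x, y):
    """Partial derivative of f with respect to x."""
    return 2.0 * x * y

def fy(x, y):
    """Partial derivative of f with respect to y."""
    return x * x + math.cos(y)

def epsilon(x, y, dx, dy):
    """Solve the increment equation for the error factor epsilon."""
    linear = fx(x, y) * dx + fy(x, y) * dy
    return (f(x + dx, y + dy) - f(x, y) - linear) / math.hypot(dx, dy)

for h in (1e-1, 1e-2, 1e-3):
    print(h, epsilon(1.0, 2.0, h, -h))
```

The error factor shrinks roughly in proportion to the size of the increment, which is what "ε small for sufficiently small δx and δy" looks like with real numbers.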



The “high tech” reason this theorem is true is this. All the specific classical functions 
are complex analytic; having infinitely many derivatives and convergent power series expan- 
sions. Formulas are built up using these functions together with addition, multiplication 
and composition - the exact rules by which we differentiate. These formation rules only 
result in more complex analytic functions of several variables. The only thing that can “go 
wrong” is to have the functions undefined. 

Despite this clear reason, it would be nice to have a more direct elementary proof. In 
fact, it would be nice to show that uniform differentiability is “closed” under the operations 
of basic calculus and specifically including solution of initial value problems and indefinite 
integration in particular. Try to prove this yourself or WATCH OUR WEB SITE! 
http://www.math.uiowa.edu/stroyan/
Theory of Initial Value Problems

One of the main ideas of calculus is that if we know

(1) where a quantity starts, $x[t_0] = x_0$

and

(2) how the quantity changes, $\frac{dx}{dt} = f[t,x]$

then we can find out where the quantity goes. The basic start and
continuous change information is called an initial value problem.

The solution of an initial value problem is an unknown function x[t]. This chapter shows
that there is only one solution, but since the solutions may not be given by simple formulas, 
it also studies properties of the solutions. 

11.1 Existence and Uniqueness of Solutions

If we know where to start and how to change, then we should be able to
figure out where we go. This sounds simple, but it means that there is only
one possible place for the quantity to go by a certain time.

The precise one dimensional theorem that captures this is:

Theorem 11.1. Existence & Uniqueness for I. V. P.s

Suppose that the functions f[t,x] and $\frac{\partial f}{\partial x}[t,x]$ are continuous in a rectangle around
$(x_0, t_0)$. Then the initial value problem

\[ x[t_0] = x_0, \qquad dx = f[t,x]\,dt \]

has a unique solution x[t] defined for some small time interval $t_0 - \Delta < t < t_0 + \Delta$.
Euler’s Method converges to x[t] on closed subintervals $[t_0, t_1]$, for $t_1 < t_0 + \Delta$.

The Idea of the Proof 
The simplest proof of this theorem is based on a “functional.” The unknown variable in
the differential equation is a function x[t]. Suppose we had a solution and integrated the
differential,

\[
\begin{aligned}
\int_{t_0}^{\tau} dx[t] &= \int_{t_0}^{\tau} f[t, x[t]]\,dt \\
x[\tau] - x[t_0] &= \int_{t_0}^{\tau} f[t, x[t]]\,dt
\end{aligned}
\]

We may think of the integral as a function of functions (or “functional.”) We make a
translation of variables, $u[t] = x[t + t_0] - x_0$, $g[t,u] = f[t + t_0, u + x_0]$. Given an input
function v[t], we get an output function by computing

\[ w = G[v], \quad\text{where}\quad w[\tau] = \int_0^{\tau} g[t, v[t]]\,dt \]

An equivalent problem to our differential equation is: Find a “fixed point” of the
functional G[u], that is, a function u[t] with u[0] = 0 so that G[u] = u. Notice that

\[ u[\tau] = \int_0^{\tau} g[t, u[t]]\,dt \quad\Longleftrightarrow\quad u = G[u] \]

See Exercise 11.1.1.

The proof of the theorem is very much like the computation of inverse functions in
Theorem 5.7 above. We begin with $u_0[t] = 0$ and successively compute

\[
\begin{aligned}
u_1 &= G[u_0] \\
u_2 &= G[u_1] = G[G[u_0]] \\
u_3 &= G[u_2] = G[G[u_1]] = G[G[G[u_0]]] \\
 &\;\;\vdots
\end{aligned}
\]

This iteration is an infinite dimensional discrete dynamical system or a discrete dynamical
system on functions. If we choose a small enough time interval, $0 \le t \le \Delta$, we can show
that the functional $G[\cdot]$ decreases the distance between functions as measured by

\[ \|u - v\| = \mathrm{Max}[\,|u[t] - v[t]| : 0 \le t \le \Delta\,] \]

That is, $\|G[u] - G[v]\| \le r\, \|u - v\|$, with a constant r satisfying $0 \le r < 1$. (We will
show that we can take r = 1/2 below.) The proof of convergence of these approximating
functions is just like the proof in Theorem 5.7 once we have this “contraction” property.
The iteration scheme above reduces the distance between successive terms, and we can prove
that $\|u_{n+1} - u_n\| \le r^n \cdot \|u_1 - u_0\|$ by recursively using each inequality in the next as follows:

\[
\begin{aligned}
\|u_2 - u_1\| &= \|G[u_1] - G[u_0]\| \le r \cdot \|u_1 - u_0\| \\
\|u_3 - u_2\| &= \|G[u_2] - G[u_1]\| \le r \cdot \|G[u_1] - G[u_0]\| \le r^2 \cdot \|u_1 - u_0\| \\
\|u_4 - u_3\| &= \|G[u_3] - G[u_2]\| \le r \cdot \|G[u_2] - G[u_1]\| \le r^3 \cdot \|u_1 - u_0\| \\
 &\;\;\vdots \\
\|u_{n+1} - u_n\| &\le r^n \cdot \|u_1 - u_0\|
\end{aligned}
\]
The sequence of iterates tends to the actual solution, $u_n \to u$, the function satisfying
u = G[u]. To see this we use the geometric series from Chapter 25 of the main text on the
numbers $\|u_{n+1} - u_n\|$. Notice that the distance we move for any number m steps beyond
the approximation $u_n$ satisfies

\[
\begin{aligned}
|u_{n+m}[t] - u_n[t]| &= |u_{n+m}[t] - u_{n+m-1}[t] + u_{n+m-1}[t] - \cdots - u_n[t]| \\
 &\le |u_{n+m}[t] - u_{n+m-1}[t]| + |u_{n+m-1}[t] - u_{n+m-2}[t]| + \cdots + |u_{n+1}[t] - u_n[t]| \\
 &\le \|u_{n+m} - u_{n+m-1}\| + \|u_{n+m-1} - u_{n+m-2}\| + \cdots + \|u_{n+1} - u_n\| \\
 &\le r^n \cdot \|u_1 - u_0\| \cdot (r^m + r^{m-1} + \cdots + 1) \\
 &\le r^n \cdot \|u_1 - u_0\| \cdot \frac{1 - r^{m+1}}{1 - r} \le r^n \cdot \|u_1 - u_0\| \cdot \frac{1}{1 - r}
\end{aligned}
\]

We use the explicit formula $(r^m + r^{m-1} + \cdots + 1) = \frac{1 - r^{m+1}}{1 - r}$ for the finite geometric series.

(We use $0 \le r < 1$ in the last step.)

Uniqueness follows from the same functional, because if both u and v satisfy the problem,
u = G[u] and v = G[v], then

\[ \|u - v\| = \|G[u] - G[v]\| \le r\, \|u - v\| \quad\text{with } r < 1 \]

This shows that $\|u - v\| = 0$, or u = v as functions, since the maximum difference is zero.
This is all there is to the proof conceptually, but the details that show the integral functional
is defined on the successive iterates are quite cumbersome. The details follow, if you are
interested.
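
The iteration u, G[u], G[G[u]], ... can be carried out exactly when g is a polynomial in u and the iterates are polynomials in t. The following sketch assumes the test problem dx/dt = x, x[0] = 1 (so $t_0 = 0$, $x_0 = 1$ and g[t,u] = u + 1); the coefficient-list representation and the helper names `picard_step` and `evaluate` are our own illustrative choices.

```python
from fractions import Fraction
import math

def picard_step(coeffs):
    """One application of G for g[t, u] = u + 1.

    coeffs[k] is the coefficient of t**k in the polynomial u[t].
    G[u][tau] is the integral from 0 to tau of (u[t] + 1) dt,
    computed term by term.
    """
    integrand = list(coeffs)
    integrand[0] += 1                      # form u[t] + 1
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]

def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at time t."""
    return sum(float(c) * t ** k for k, c in enumerate(coeffs))

u = [Fraction(0)]                          # u0[t] = 0
for n in range(1, 9):
    u = picard_step(u)
    x_half = 1.0 + evaluate(u, 0.5)        # x[t] = u[t] + x0 with x0 = 1
    print(n, x_half, abs(x_half - math.exp(0.5)))
```

Each pass adds one more term of the exponential series, and the error at t = 1/2 falls off at least as fast as the geometric bound in the argument above.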

The Details of the Proof 

The proof uses two maxima. In a sense, the maximum M below gives existence of 
solutions and the maximum L gives uniqueness. This is where the hypothesis about the 
differentiability of f[t,x] enters the proof. 

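In Exercise 11.1.2 you show that if f[t,x] and $\frac{\partial f}{\partial x}[t,x]$ are defined and continuous on the
compact rectangle $|x - x_0| \le b_x$ and $|t - t_0| \le b_t$, then the maxima M and L below exist
and

\[
\begin{aligned}
M &= \mathrm{Max}[\,|f[t,x]| : |x - x_0| \le b_x \text{ and } |t - t_0| \le b_t\,] \\
 &= \mathrm{Max}[\,|g[t,u]| : |u| \le b_x \text{ and } |t| \le b_t\,]
\end{aligned}
\]

and

\[
\begin{aligned}
L &= \mathrm{Max}\left[\,\left|\frac{\partial f}{\partial x}[t,x]\right| : |x - x_0| \le b_x \text{ and } |t - t_0| \le b_t\,\right] \\
 &= \mathrm{Max}\left[\,\left|\frac{\partial g}{\partial u}[t,u]\right| : |u| \le b_x \text{ and } |t| \le b_t\,\right]
\end{aligned}
\]

An important detail in the proof is a prior estimate of the time that a solution could last.
(We know from Problem 21.7 of the main text that solutions can “explode” in finite time.)
This is transferred technically to the problem of making G[v] defined when v = G[u].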



As long as g[t, u[t]] is defined for $0 \le t \le \tau$, we know our functional $G[\cdot]$ satisfies

\[ |G[u](\tau)| = \left| \int_0^{\tau} g[t, u[t]]\,dt \right| \le M \left| \int_0^{\tau} dt \right| = M \cdot \tau \]

when $|g[t, u[t]]| \le M$. This is a little circular, because we need to have g[t, u[t]] defined in
order to use the estimate. What we really need is that $g[t, u_{n+1}[t]]$ is defined on $0 \le t \le \tau$
provided that $g[t, u_n[t]]$ is defined on $0 \le t \le \tau$ and $g[t, u_0[t]]$ is also defined on $0 \le t \le \tau$.

Exercise 11.1.3 shows the following successive definitions. Let $u_0[t] = 0$ for all t. Then
$g[t, u_0[t]] = f[t + t_0, x_0]$ is defined for all t with $0 \le t \le \tau$ as long as $\tau \le b_t$ (the constant
given above.) Let $\Delta_1$ be a positive number satisfying $\Delta_1 \le b_t$ and $\Delta_1 \le b_x/M$ (for the
maximum M above and in Exercise 11.1.2.) Let $u_1[t] = G[u_0][t]$. Then $|u_1[t]| \le b_x$, so
that $g[t, u_1[t]]$ is defined for $0 \le t \le \Delta_1$. Continuing this procedure, the whole sequence
$u_{n+1} = G[u_n]$ is defined for $0 \le t \le \Delta_1$.

The maximum partial derivative L is needed for the contraction property. For each fixed
t and any numbers u and v with $|u| \le b_x$ and $|v| \le b_x$, we can use (integration or) the Mean
Value Theorem 7.1 (on the function F[u] = g[t,u]) to show that

\[ |g[t,v] - g[t,u]| \le L \cdot |v - u| \]

This is called a “Lipschitz estimate” for the change in g. See Exercise 11.1.4.

Let $\Delta$ be a positive number with $\Delta \le \Delta_1$ and $\Delta \le 1/(2L)$. For any two continuous
functions u[t] and v[t] defined on $[0, \Delta]$ with maximum less than $b_x$,

\[
\begin{aligned}
|G[u] - G[v]|(\tau) &= \left| \int_0^{\tau} g[t, u[t]]\,dt - \int_0^{\tau} g[t, v[t]]\,dt \right| \\
 &\le \int_0^{\tau} |g[t, u[t]] - g[t, v[t]]|\,dt \\
 &\le \int_0^{\tau} L \cdot \mathrm{Max}[\,|u[t] - v[t]| : 0 \le t \le \Delta\,]\,dt \\
 &\le \int_0^{\tau} L\, \|u - v\|\,dt \le \|u - v\| \cdot L \cdot \Delta \le \frac{1}{2}\, \|u - v\|
\end{aligned}
\]

This shows that the iteration idea above will produce a solution defined for $0 \le t \le \Delta$ and
completes the details of the proof.

Once we know that there is an exact solution, the idea for a proof of convergence of Euler’s
method given in Section 21.2 of the core text applies and shows that the Euler
approximations converge to the true solution. (The functional $G[\cdot]$, called the Picard approximation,
is usually not a practical approximation method.)
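
Convergence of Euler's Method is easy to observe on a problem with a known solution. The sketch below assumes the test problem dx/dt = x, x[0] = 1 on the interval [0, 1] (this linear problem has a solution for all time, so the interval is not limited by the small Δ of the theorem); the function name `euler` is our own.

```python
import math

def euler(f, t0, x0, t1, steps):
    """Euler's Method for dx = f[t, x] dt from (t0, x0) up to time t1."""
    dt = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        x += f(t, x) * dt
        t += dt
    return x

def f(t, x):
    return x            # test problem dx/dt = x, x[0] = 1, with solution e**t

for steps in (10, 100, 1000):
    approx = euler(f, 0.0, 1.0, 1.0, steps)
    print(steps, approx, abs(approx - math.e))
```

The error drops by about a factor of ten each time the number of steps is multiplied by ten.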



Exercise set 11.1

1. Define the functional G[u] as in the proof of Theorem 11.1. Show that a function u[t]
satisfies u = G[u] if and only if $x[t + t_0] = u[t] + x_0$ satisfies $x[t_0] = x_0$ and $\frac{dx}{dt} = f[t, x[t]]$.

2. If f[t,x] and $\frac{\partial f}{\partial x}[t,x]$ are defined and continuous on the compact rectangle $|x - x_0| \le b_x$
and $|t - t_0| \le b_t$, show that the maxima M and L below exist and

\[
\begin{aligned}
M &= \mathrm{Max}[\,|f[t,x]| : |x - x_0| \le b_x \text{ and } |t - t_0| \le b_t\,] \\
 &= \mathrm{Max}[\,|g[t,u]| : |u| \le b_x \text{ and } |t| \le b_t\,] \\
L &= \mathrm{Max}\left[\,\left|\frac{\partial f}{\partial x}[t,x]\right| : |x - x_0| \le b_x \text{ and } |t - t_0| \le b_t\,\right] \\
 &= \mathrm{Max}\left[\,\left|\frac{\partial g}{\partial u}[t,u]\right| : |u| \le b_x \text{ and } |t| \le b_t\,\right]
\end{aligned}
\]

3. Let $u_0[t] = 0$ for all t. Prove that $g[t, u_0[t]]$ is defined for all t with $0 \le t \le \tau$ as long
as $\tau \le b_t$ (the constant given in the Exercise 11.1.2 above.)

Let $\Delta_1$ be a positive number satisfying $\Delta_1 \le b_t$ and $\Delta_1 \le b_x/M$ (for the maximum M
in Exercise 11.1.2.) Let $u_1[t] = G[u_0][t]$ and show that $|u_1[t]| \le b_x$, so that $g[t, u_1[t]]$ is
defined for $0 \le t \le \Delta_1$.

Continue this procedure and show that the whole sequence $u_{n+1} = G[u_n]$ is defined for
$0 \le t \le \Delta_1$.

4. Calculate the integral

\[ \int_u^v \frac{\partial g}{\partial x}[t,x]\,dx \]

(where t is fixed.)

If h[x] is any continuous function for $|x| \le b_x$, and $\mathrm{Max}[\,|h[x]| : |x| \le b_x\,] \le L$ (in
particular, if $h[x] = \frac{\partial g}{\partial x}[t,x]$) show that

\[ \left| \int_u^v h[x]\,dx \right| \le L \cdot \left| \int_u^v dx \right| = L \cdot |v - u| \]

provided $|u| \le b_x$ and $|v| \le b_x$. Which property of the integral do you use?

Combine the two previous parts to show that for any t with $|t| \le b_t$ and any numbers u
and v with $|u| \le b_x$ and $|v| \le b_x$,

\[ |g[t,v] - g[t,u]| \le L \cdot |v - u| \]



11.2 Local Linearization of Dynamical Systems



Now we consider a microscopic view of a nonlinear equilibrium point. 



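Theorem 11.2. Microscopic Equilibria

Let f[x,y] and g[x,y] be smooth functions with $f[x_e,y_e] = g[x_e,y_e] = 0$. The flow
of

\[ \frac{dx}{dt} = f[x,y], \qquad \frac{dy}{dt} = g[x,y] \]

under infinite magnification at $(x_e,y_e)$ appears the same as the flow of its linearization

\[
\begin{bmatrix} \dfrac{du}{dt} \\[2mm] \dfrac{dv}{dt} \end{bmatrix}
= \begin{bmatrix} a_x & a_y \\ b_x & b_y \end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
\qquad\text{where}\qquad
\begin{bmatrix} a_x & a_y \\ b_x & b_y \end{bmatrix}
= \begin{bmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[2mm] \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \end{bmatrix} [x_e, y_e]
\]

Specifically, if our magnification is $1/\delta$, for $\delta \approx 0$, and our solution starts in our
view,

\[ (x[0] - x_e,\; y[0] - y_e) = \delta \cdot (a, b) \]

for finite a and b and if (u[t], v[t]) satisfies the linear equation and starts at
(u[0], v[0]) = (a, b), then

\[ (x[t] - x_e,\; y[t] - y_e) = \delta \cdot (u[t], v[t]) + \delta \cdot (\varepsilon_x[t], \varepsilon_y[t]) \]

where $(\varepsilon_x[t], \varepsilon_y[t]) \approx (0,0)$ for all finite t.

Equivalently, for every screen resolution $\theta$ and every bound $\beta$ on the time of
observation and observed scale of initial condition, there is a magnification large
enough so that if $|a| \le \beta$ and $|b| \le \beta$, then the error observed at that magnification
is less than $\theta$ for $0 \le t \le \beta$, and in particular, the solution lasts until time $\beta$.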



Proof; 

We give the proof in the 1-D case, where the pictures are not very interesting, but the 
ideas of the approximation are the same except for the technical difficulty of estimating 
vectors rather than numbers. 

We define functions

\[ z[t] = \frac{1}{\delta}\, (x[t] - x_e) \qquad\text{and}\qquad F[z] = \frac{1}{\delta}\, f[z \cdot \delta + x_e] \]

when x[t] is a solution of the original equation. z[t] is what we observe when we focus a
microscope of magnification $1/\delta$ at the equilibrium point $x_e$, with $f[x_e] = 0$, and watch a
solution of the original equation. We want to compare z[t] starting with z[0] = a to the
solution of $\frac{du}{dt} = b\,u$ where $b = f'[x_e]$ and u[0] = a as well. Exercise 11.2.1 shows:

(a) z[t] satisfies $\frac{dz}{dt} = F[z]$, when x[t] satisfies $\frac{dx}{dt} = f[x]$.

(b) If z is a finite number, $F[z] \approx b \cdot z$.

(c) Let $\rho > 0$ be any real number so that the following max is defined and let

\[ L = \mathrm{Max}[\,|f'[x]| : |x - x_e| \le \rho\,] + 1 \]

Then if $z_1$ and $z_2$ are finite,

\[ |F[z_2] - F[z_1]| \le L\, |z_2 - z_1| \]
Our first lemma in the microscope proof is:

Theorem 11.3. Lemma on Existence of the Infinitesimal Solution

The problem $\frac{dz}{dt} = F[z]$ and z[0] = a has a solution defined for all finite time.

Proof of the Lemma: See Problem 11.1.

Let u[t] satisfy u[0] = a and $\frac{du}{dt} = b \cdot u$. (Ignore the fact that we know a formula in the
1-D case.) Define $\varepsilon[t]$ by

\[ z[t] = u[t] + \varepsilon[t] \]

Now use Taylor’s formula for f[x] at $x_e$, and use the fact that $f[x_e] = 0$:

\[
\begin{aligned}
F[z] &= \frac{1}{\delta}\, f[x_e + \delta z] \\
 &= \frac{1}{\delta}\, f[x_e] + f'[x_e] \cdot (u[t] + \varepsilon[t]) + \int_0^1 (f'[x_e + s\,\delta z] - f'[x_e])\, z\,ds \\
 &= b \cdot (u[t] + \varepsilon[t]) + \int_0^1 (f'[x_e + s\,\delta z] - f'[x_e])\, z\,ds
\end{aligned}
\]

so

\[ \frac{d\varepsilon}{dt} = b\,\varepsilon[t] + \eta[t] \]

with $\eta[t] \approx 0$ for all finite t.

This differential equation is equivalent to

\[ \varepsilon[t] = \int_0^t \eta[s]\,ds + b \int_0^t \varepsilon[s]\,ds \]

so for any positive real $\alpha$, no matter how small (but not infinitesimal) and any finite t,

\[ |\varepsilon[t]| \le \alpha + b \int_0^t |\varepsilon[s]|\,ds \]

Which implies that

\[ |\varepsilon[t]| \le \alpha\, e^{b\,t} \]

and since $\alpha$ is arbitrarily small, $\varepsilon[t] \approx 0$ for all finite t.

To see this last implication, let

\[ H[t] = \alpha + b \int_0^t |\varepsilon[s]|\,ds \]

so $|\varepsilon[t]| \le H[t]$. We know $H[0] = \alpha$ and $H'[t] = b\,|\varepsilon[t]| \le b\,H[t]$ by the second half of the
Fundamental Theorem of Integral Calculus and the previous estimate. Hence,

\[
\begin{aligned}
\frac{H'[t]}{H[t]} &\le b \\
\int_0^s \frac{H'[t]}{H[t]}\,dt &\le \int_0^s b\,dt \\
\mathrm{Log}\left[\frac{H[s]}{H[0]}\right] &\le b\,s \\
H[s] &\le \alpha\, e^{b\,s}
\end{aligned}
\]



This proves the infinitesimal microscope theorem for dynamical systems, but better than 
a proof, we offer some interesting experiments for you to try yourself in the exercises. 

This is a finite time approximation result. In the limit as t tends to infinity, the nonlinear 
system can “look” different. Here is another way to say this. If we magnify a lot, but not by 
an infinite amount, then we may see a separation between the linear and nonlinear system 
after a very very long time. As a matter of fact, a solution beginning a small finite distance 
away from the equilibrium can ‘escape to infinity’ in large finite time. 
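
The finite-time character of the approximation can be seen in a one dimensional experiment. This sketch assumes the equation dx/dt = x - x² (equilibrium $x_e = 0$ with b = f'[0] = 1), real magnifications 1/δ in place of the infinite one, and a classical Runge-Kutta step as the solver; none of these choices come from the text, and `magnified_solution` is a hypothetical helper name.

```python
import math

def f(x):
    return x - x * x        # equilibrium x_e = 0 with b = f'[0] = 1

def magnified_solution(delta, a, t_end, steps=2000):
    """RK4 solution of dx/dt = f[x] from x[0] = delta*a, viewed as z = x/delta."""
    dt = t_end / steps
    x = delta * a
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x / delta

a, t_end = 1.0, 2.0
linear = a * math.exp(t_end)          # u[t] = a e^{b t} with b = 1
for delta in (1e-1, 1e-3, 1e-5):
    z = magnified_solution(delta, a, t_end)
    print(delta, z, abs(z - linear))
```

At a fixed observation time the magnified nonlinear solution approaches the linear one as the magnification increases, while at a fixed magnification the gap reopens if we watch long enough.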



Exercise set 11.2

1. Show that z[t] satisfies $\frac{dz}{dt} = F[z]$, when x[t] satisfies $\frac{dx}{dt} = f[x]$.

Show that if z is a finite number, $F[z] \approx b \cdot z$.

Let $\rho > 0$ be any real number so that the following max is defined and let

\[ L = \mathrm{Max}[\,|f'[x]| : |x - x_e| \le \rho\,] + 1 \]

Show that when $z_1$ and $z_2$ are finite,

\[ |F[z_2] - F[z_1]| \le L\, |z_2 - z_1| \]



The following experiments go beyond the main text examples of magnifying a flow near 
an equilibrium where the linearized problems are non-degenerate. 

2. Use the phase variable trick to write the differential equation

\[ \frac{d^2x}{dt^2} + 2\,\frac{dx}{dt} + x + 15\,x^3 = 0 \]

as a two dimensional first order system with f[x, y] = y and $g[x, y] = -x - 2y - 15x^3$.
Prove that the only equilibrium point is at $(x_e, y_e) = (0, 0)$.

Prove that the linearization of the system is the system

\[ \begin{bmatrix} \frac{du}{dt} \\ \frac{dv}{dt} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} \]

and that it has only the single characteristic value r = -1.

Use the Flow2D.ma NoteBook to solve the linear and nonlinear systems at various 
scales. A few experiments are shown next. Notice the different shape of the nonlinear 
system at large scale and that the difference gradually vanishes as the magnification 
increases. The first three figures are nonlinear, and the fourth is linear at same scale 
as the small nonlinear case. 
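
For readers without the Flow2D.ma NoteBook, the comparison in this exercise can be approximated in a few lines of Python. This is only a sketch under stated assumptions: it takes the nonlinear system to be dx/dt = y, dy/dt = -x - 2y - 15x^3, uses a hand-written classical Runge-Kutta step, and follows a single initial point (delta, 0). The names rk4_step, nonlinear, linear and max_gap are illustrative and are not part of Flow2D.ma.

```python
def rk4_step(field, state, h):
    """One classical Runge-Kutta step for a planar system state' = field(state)."""
    k1 = field(state)
    k2 = field([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = field([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = field([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def nonlinear(state):
    x, y = state
    return [y, -x - 2.0 * y - 15.0 * x**3]

def linear(state):
    u, v = state
    return [v, -u - 2.0 * v]

def max_gap(delta, t_max=2.0, h=0.01):
    """Largest |x/delta - u| on [0, t_max]; both flows start at the magnified point (1, 0)."""
    xy, uv = [delta, 0.0], [1.0, 0.0]
    worst = 0.0
    for _ in range(int(round(t_max / h))):
        xy = rk4_step(nonlinear, xy, h)
        uv = rk4_step(linear, uv, h)
        worst = max(worst, abs(xy[0] / delta - uv[0]))
    return worst

# Scale 1 (no magnification), then magnifications 10 and 100.
gaps = [max_gap(d) for d in (1.0, 0.1, 0.01)]
print(gaps)
```

At scale 1 the cubic term dominates and the two flows visibly differ; as delta shrinks the magnified nonlinear solution settles onto the linear one.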









3. Use the phase variable trick to write the differential equation

\[ \frac{d^2x}{dt^2} + x^3 = 0 \]

as a two dimensional first order system with f[x, y] = y and $g[x, y] = -x^3$.

Prove that the only equilibrium point is at $(x_e, y_e) = (0, 0)$.

Prove that the linearization of the system is the system

\[ \begin{bmatrix} \frac{du}{dt} \\ \frac{dv}{dt} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} \]

and that it has only the single characteristic value r = 0.

Use the Flow2D.ma NoteBook to solve the linear and nonlinear systems at various 
scales. A few experiments are shown next. The first three figures are nonlinear, and 
the fourth is linear at same scale as the small nonlinear case. What is the analytical 
solution of the linear system? 




Problem 11.1. Existence of the Infinitesimal Solution 

Prove that the problem $\frac{dz}{dt} = F[z]$ and z[0] = a has a solution defined for all finite time.



HINTS: 
Apply the idea in the proof of existence-uniqueness, Theorem 11.1 above. Define,

\[ z_0[t] = a \]

\[ z_{n+1}[t] = a + \int_0^t F[z_n[s]]\, ds \]

We have $|z_1[t] - z_0[t]| \le \left| \int_0^t F[a]\, ds \right| \le t\, |F[a]| \approx |b \cdot a| \cdot t$, which is finite. Next,

\[ |z_2[t] - z_1[t]| \le \int_0^t |F[z_1[s]] - F[z_0[s]]|\, ds \]

\[ \le L \int_0^t |z_1[s] - a|\, ds \]

\[ \le L \cdot |F[a]| \cdot \frac{t^2}{2} \]

Since $L \ge 1$,

\[ |z_2[t] - a| \le |z_2[t] - z_1[t]| + |z_1[t] - a| \]

\[ \le |F[a]| \cdot \left[ L t + \tfrac{1}{2} (L t)^2 \right] \le |F[a]| \cdot \left[ 1 + L t + \tfrac{1}{2} (L t)^2 \right] \]

Continue by induction to show that

\[ |z_{n+1}[t] - a| \le |F[a]| \left[ 1 + L t + \tfrac{1}{2} (L t)^2 + \cdots + \tfrac{1}{(n+1)!} (L t)^{n+1} \right] \le |F[a]|\, e^{L t} \]

This shows that $z_n[t]$ is finite when t is. We can also show that $z_n[t] \to z[t]$, the solution to
the initial value problem, as we did in the existence-uniqueness theorem.
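
The successive approximations in these hints can be carried out exactly in a simple special case. The sketch below is an illustration only: it assumes the right-hand side is exactly linear, F[z] = b z (the text only says F[z] is infinitely close to b z), so each approximation is a polynomial in t that can be stored as a list of coefficients. The function names picard_linear and evaluate are hypothetical.

```python
import math

def picard_linear(a, b, n):
    """Coefficients (constant term first) of z_n(t) for dz/dt = b*z, z(0) = a, with z_0(t) = a."""
    coeffs = [a]                                   # z_0(t) = a
    for _ in range(n):
        # z_{k+1}(t) = a + integral from 0 to t of b*z_k(s) ds, integrated term by term
        coeffs = [a] + [b * c / (k + 1) for k, c in enumerate(coeffs)]
    return coeffs

def evaluate(coeffs, t):
    """Value at t of the polynomial with the given coefficients."""
    return sum(c * t**k for k, c in enumerate(coeffs))

a, b, t = 2.0, -1.0, 1.0
z15 = evaluate(picard_linear(a, b, 15), t)
print(z15, a * math.exp(b * t))                    # the iterates approach a*exp(b*t)

# The bound from the hints, |z_n(t) - a| <= |F[a]| * exp(L*t), with L = |b| + 1 here.
L = abs(b) + 1.0
print(abs(z15 - a) <= abs(b * a) * math.exp(L * t))
```

In this linear case the n-th approximation is just the n-th Taylor polynomial of a e^{bt}, which is why the exponential bound appears in the induction.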



11.3 Attraction and Repulsion 



This section studies the cases where solutions stay in the microscope for 
infinite time. 



The local stability of an equilibrium point for a dynamical system is formulated as the 
next result. Notice that stability is an “infinite time” result, whereas the localization of the 
previous theorem is a finite time result after magnification. 



Theorem 11.4. Local Stability 

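Let f[x,y] and g[x,y] be smooth functions with $f[x_e, y_e] = g[x_e, y_e] = 0$. The
coefficients given by the partial derivatives evaluated at the equilibrium

\[ \begin{bmatrix} a_x & a_y \\ b_x & b_y \end{bmatrix} = \begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{bmatrix} [x_e, y_e] \]

define the characteristic equation of the equilibrium,

\[ \det \begin{bmatrix} a_x - r & a_y \\ b_x & b_y - r \end{bmatrix} = (a_x - r)(b_y - r) - a_y b_x = r^2 - (a_x + b_y)\, r + (a_x b_y - a_y b_x) = 0 \]

Suppose that the real parts of both of the roots of this equation are negative. Then
there is a real neighborhood of $(x_e, y_e)$ or a non-infinitesimal $\varepsilon > 0$ such that when
a solution satisfies

\[ \frac{dx}{dt} = f[x, y], \qquad \frac{dy}{dt} = g[x, y] \]

with initial condition in the neighborhood, $|x[0] - x_e| < \varepsilon$ and $|y[0] - y_e| < \varepsilon$, then

\[ \lim_{t \to \infty} x[t] = x_e \quad \text{and} \quad \lim_{t \to \infty} y[t] = y_e \]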
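
The hypothesis on the roots of the characteristic equation is easy to check by machine. Here is a small sketch, assuming the four partial derivatives at the equilibrium have already been computed by hand; the function names are our own and the quadratic formula is applied with complex arithmetic so that oscillatory cases are covered.

```python
import cmath

def characteristic_roots(ax, ay, bx, by):
    """Roots of r**2 - (ax + by)*r + (ax*by - ay*bx) = 0."""
    trace = ax + by
    det = ax * by - ay * bx
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def roots_have_negative_real_parts(ax, ay, bx, by):
    """True exactly when the hypothesis of the Local Stability theorem holds."""
    return all(r.real < 0 for r in characteristic_roots(ax, ay, bx, by))

# Coefficients [[0, 1], [-1, -2]]: double root r = -1, so the theorem applies.
print(characteristic_roots(0.0, 1.0, -1.0, -2.0))
print(roots_have_negative_real_parts(0.0, 1.0, -1.0, -2.0))

# Coefficients [[0, 1], [0, 0]]: double root r = 0, so the theorem gives no conclusion.
print(roots_have_negative_real_parts(0.0, 1.0, 0.0, 0.0))
```

A result of False does not mean the equilibrium is unstable; it only means this theorem cannot be used.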



Proof 

One way to prove this theorem is to ‘keep on magnifying.’ If we begin with any solution
inside a square with an infinitesimal side $2\varepsilon$, then the previous magnification result says
that the solution appears to be in a square of half the original side in the time that it takes
the linearization to do this. It might be complicated to compute the maximum time for a
linear solution starting on the square, but we could do so based on the characteristic roots
in the linear solution terms of the form $e^{r t}$. It is a fixed finite time $\tau$. We could then start
up again at the half size position and watch again for time $\tau$. After each time interval of
length $\tau$, we would arrive nearly in a square of half the previous side.

If we want to formulate this with only reference to real quantities, we need to remove
the fact that the true solution is only infinitely near the linear one on the scale of the
magnification. Since it appears to be in a square of one half the side on that scale, the true
solution must be inside a square of 2/3 the side within time $\tau$. Since this holds for every
infinitesimal radius, the Function Extension Axiom guarantees that it also holds for some
positive real $\varepsilon$. After time $n \times \tau$, the true solution is inside a square of side $(2/3)^n$ times the
original length of the side. $\lim_{n \to \infty} (2/3)^n = 0$, so our theorem is proved.

Continuous dynamical systems have a local repeller theorem, unlike discrete dynamical 
systems. Discrete solutions can “jump” inside the microscope, but continuous solutions 
‘move continuously.’ You could formulate a local repeller theorem by ‘zooming out’ with 
the microscopic theorem above. How would a ring of initial conditions move when viewed 
inside a microscope if the characteristic values had only positive real parts? 








11.4 Stable Limit Cycles 



Solutions of a dynamical system do not necessarily tend to an attracting 
point or to infinity. 



There are nonlinear oscillators which have stable oscillations in the sense that every 
solution (except zero) tends to the same oscillatory solution. One of the most famous 
examples is the Van der Pol equation: 



\[ \frac{d^2x}{dt^2} + a\,(x^2 - 1)\,\frac{dx}{dt} + x = 0 \]






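Your experiments in the first exercise below will reveal a certain sensitivity of this form of
the Van der Pol equation. A more stable equivalent system may be obtained by a different
change of variables, $w = -\int x\, dt$,

\[ \frac{dx}{dt} = w - a\left( \frac{x^3}{3} - x \right), \qquad \frac{dw}{dt} = -x \]

Since

\[ \frac{d^2x}{dt^2} + a\,(x^2 - 1)\,\frac{dx}{dt} = \frac{d}{dt}\left( \frac{dx}{dt} + a\left( \frac{x^3}{3} - x \right) \right) \]

the Van der Pol equation says

\[ \frac{d}{dt}\left( \frac{dx}{dt} + a\left( \frac{x^3}{3} - x \right) \right) = -x = \frac{dw}{dt} \]

so

\[ \frac{dx}{dt} + a\left( \frac{x^3}{3} - x \right) = w \quad \text{and} \quad \frac{dx}{dt} = w - a\left( \frac{x^3}{3} - x \right) \]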




Figure 11.1: Van der Pol Flow 
Exercise set 11.4

1. Van der Pol 1

Show that the following is an equivalent system to the first form of the Van der Pol
equation via the phase variable trick, $y = \frac{dx}{dt}$,

\[ \frac{dx}{dt} = y \]

\[ \frac{dy}{dt} = -x - a\,y\,(x^2 - 1) \]

Use Flow2D.ma to make a flow for this system and observe that every solution except 
zero tends to the same oscillation. 

2 . Van der Pol 2 

Use Flow2D.ma to create an animation of the second version of the Van der Pol 
dynamics. 
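
If Flow2D.ma is not at hand, the limit cycle can still be observed with a short Python experiment. The sketch below is an assumption-laden stand-in, not the NoteBook: it uses the phase variable system dx/dt = y, dy/dt = -x - a y (x^2 - 1) with a = 1, a hand-written Runge-Kutta step, and measures the largest |x| over the last stretch of a long run. The names van_der_pol, rk4_step and late_amplitude are illustrative.

```python
def van_der_pol(state, a=1.0):
    """Right-hand side of the phase variable form of the Van der Pol equation."""
    x, y = state
    return [y, -x - a * y * (x * x - 1.0)]

def rk4_step(field, state, h):
    """One classical Runge-Kutta step for state' = field(state)."""
    k1 = field(state)
    k2 = field([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = field([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = field([s + h * k for s, k in zip(state, k3)])
    return [s + h * (p + 2 * q + 2 * r + w) / 6.0
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def late_amplitude(x0, y0, t_total=60.0, t_window=10.0, h=0.01):
    """Largest |x| seen during the final t_window time units of the run."""
    state = [x0, y0]
    steps = int(round(t_total / h))
    window_start = steps - int(round(t_window / h))
    amp = 0.0
    for i in range(steps):
        state = rk4_step(van_der_pol, state, h)
        if i >= window_start:
            amp = max(amp, abs(state[0]))
    return amp

inner = late_amplitude(0.1, 0.0)   # starts near the repelling equilibrium
outer = late_amplitude(3.0, 0.0)   # starts outside the cycle
print(inner, outer)                # both settle on the same oscillation
```

Starting inside or outside, the late-time amplitude comes out the same, which is the numerical signature of a stable limit cycle.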








The Theory of Power Series 



This chapter fills in a few details of the theory of series not covered 
in the main text Chapters 24 and 25. 



What do we mean by an infinite sum

\[ u_1[x] + u_2[x] + u_3[x] + \cdots \; ? \]

The traditional notation is

\[ \sum_{k=1}^{\infty} u_k[x] = \lim_{m \to \infty} \sum_{k=1}^{m} u_k[x] \]

where the limit may be defined at least two different ways using real numbers. One way to 
define the limit allows the rate of convergence to be different at different values of x and the 
other has the whole graph of the partial sum functions approximate the limiting graph. The 
weaker kind of limit makes the “sum” have fewer “calculus” properties. The stronger 

limit makes the “infinite sum” behave more like an ordinary sum. The traditional notation 
makes no distinction between uniform and non-uniform convergence. Perhaps this is 
unfortunate since the equation 



\[ \int_a^b \sum_{k=1}^{\infty} u_k[x]\, dx = \sum_{k=1}^{\infty} \int_a^b u_k[x]\, dx \]



is true for uniform convergence (when the Uk [x] are continuous) and may be false for point- 
wise convergence (even when the Uk[x] are continuous). 

The love knot symbol “$\infty$” is NOT a hyperreal number, and is not an integer, because
it does not satisfy the formal properties of arithmetic (as the Axioms in Chapter 1 require
of hyperreals. For example, $\infty + \infty \ne 2\infty$.) Hyperreal integers will retain the arithmetic
properties of an ordered field, so we always have

\[ \int_a^b \sum_{k=1}^{n} u_k[x]\, dx = \sum_{k=1}^{n} \int_a^b u_k[x]\, dx \]



when n is a hyperreal integer, even an infinite one. This seems a little paradoxical, since we 
expect to have 

\[ \sum_{k=1}^{n} u_k[x] \approx \sum_{k=1}^{\infty} u_k[x] \]

It turns out that we can understand uniform and non-uniform convergence quite easily from 
this approximation. The secret lies in examining the approximation when x is hyperreal. If 



the convergence is non-uniform, then even when n is infinite the hyperreal sum will NOT 
be near the limit for some non-ordinary x’s. In other words, non-uniform convergence is 
“infinitely slow” at hyperreal x’s. (You might even wish to say the series is not converging 
at these hyperreal values.) 

What do we mean by an infinite sum, $\sum_{k=1}^{n} u_k[x]$, when n is an infinite hyperreal integer?

What do we even mean by an infinite integer n? On the ordinary real numbers we can define
the indicator function of the integers

\[ I[x] = \begin{cases} 1, & \text{if } x \text{ is an integer} \\ 0, & \text{if } x \text{ is not an integer} \end{cases} \]

The equation I[m] = 1 says “m is an integer”. The formal statement

\[ \{ I[m] = 1,\; a \le x \le b \} \Rightarrow s[m, x] \text{ is defined} \]

is true when m and x are real. The function I[x] has a natural extension and we take the
equation I[n] = 1 to be the meaning of the statement “n is a hyperinteger.” The natural
extensions of these functions satisfy the same implication, so when n is an infinite hyperreal
and I[n] = 1, we understand the “hyperreal infinite sum” to be the natural extension

\[ \sum_{k=1}^{n} u_k[x] = s[n, x] \]



Next, we will show that hyperintegers are either ordinary integers or infinite as you might 
expect from a sketch of the hyperreal line. Every real number r is within a half unit of an 
integer. For example, we can define the nearest integer function 

\[ N[r] = n, \text{ the integer } n \text{ such that } |r - n| < \tfrac{1}{2} \text{ or } n = r - \tfrac{1}{2} \]

and then every real r satisfies

\[ |r - N[r]| \le \tfrac{1}{2} \quad \text{and} \quad I[N[r]] = 1 \]

(As a formal logical statement we can write this $\{x = x\} \Rightarrow \{ |x - N[x]| \le \tfrac{1}{2},\; I[N[x]] = 1 \}$.)

If $m = 1, 2, 3, \cdots$ is an ordinary natural number and $|x| \le m$ in the real numbers, then
we know $N[x] = -m$ or $N[x] = -m + 1$ or $N[x] = -m + 2$ or $\cdots$ or $N[x] = m - 1$ or
$N[x] = m$. By the Function Extension Axiom, if x is a finite hyperreal (so $|x| \le m$ for some
m), then N[x] must equal an ordinary integer in the finite list from $-m$ to m described in
these equations.

If x is an infinite hyperreal, then N[x] is still a “hyperinteger” in the sense I[N[x]] = 1.
Since $|x - N[x]| \le 1/2$, N[x] is infinite, yet $s[n, x] = \sum_{k=1}^{n} u_k[x]$ is defined.
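
On ordinary real numbers the nearest integer function described above is easy to realize. The following sketch is one possible rendering, assuming the tie rule stated in the text (a half-integer r is sent to r - 1/2); the name nearest_integer is our own, and of course a floating point program only sees the real, finite part of the story.

```python
import math

def nearest_integer(r):
    """N[r]: the integer n with |r - n| < 1/2, or n = r - 1/2 when r is a half-integer."""
    return math.ceil(r - 0.5)

for r in (2.3, 2.5, 2.7, -0.2, -2.5):
    n = nearest_integer(r)
    # Every real r is within a half unit of N[r].
    print(r, n, abs(r - n) <= 0.5)
```

The Function Extension Axiom is what carries the same two properties, within a half unit and integer-valued, over to hyperreal arguments.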

Similarly, we can show that hyperreal infinite sums given by natural extension of sum 
function satisfy formal properties like 

\[ \sum_{k=1}^{m} u_k[x] + \sum_{k=m+1}^{n} u_k[x] = \sum_{k=1}^{n} u_k[x] \]
12.1 Uniformly Convergent Series

A series of continuous functions

\[ u_1[x] + u_2[x] + u_3[x] + \cdots \]

can converge to a discontinuous limit function as in Example 13.18 below.
However, this can only happen when the rate of convergence of the series
varies with x. A series whose convergence does not depend on x is said to
converge “uniformly.”



Following is the real tolerance (“epsilon - delta”) version of uniform convergence. We 
state the definition for a general sequence of functions, fm[x] = s[m,x], defined for all 
positive integers m and x in some interval. 

Definition 12.1. Uniformly Convergent Sequence

A sequence of functions s[m,x] all defined on an interval I is said to converge to
the function S[x] uniformly for x in I if for every small positive real tolerance, $\theta$,
there exists a sufficiently large real index N, so that all real functions beyond N
are $\theta$-close to S[x], specifically, if m > N and x is in I, then

\[ |S[x] - s[m, x]| < \theta \]

The needed N for a given accuracy $\theta$ does not depend on x in I; N is “uniform
in x” for $\theta$.

If we have a series of functions

\[ u_1[x] + u_2[x] + u_3[x] + \cdots \]

let the partial sum sequence be denoted

\[ s[m, x] = \sum_{k=1}^{m} u_k[x] \]

Definition 12.2. Uniformly Convergent Series

A series of functions all defined on an interval I

\[ u_1[x] + u_2[x] + u_3[x] + \cdots \]

is said to converge to the function $S[x] = \sum_{k=1}^{\infty} u_k[x]$ uniformly for x in I if the
sequence of partial sums s[m,x] converges to S[x] uniformly on I.

The equivalent definition in terms of infinitesimals is given in Theorem 12.3. By “extended
interval” we mean that the same defining inequalities hold. For example, if I = [a, b] then
a hyperreal x is in the extended interval if $a \le x \le b$. In the case $I = (a, \infty)$, a hyperreal x
is in the extended interval if a < x.
Theorem 12.3. A sequence of functions s[m,x] converges uniformly on the interval to
the real function S[x] if and only if for every infinite n and every hyperreal x in
the extended interval, the natural extension functions satisfy

\[ S[x] \approx s[n, x] \]

Proof:

The proof is similar to that of Theorem 3.4. Given the real tolerance condition, there is
a function $N[\theta]$ such that the following implication holds in the real numbers.

\[ \{ a \le x \le b,\; \theta > 0,\; m > N[\theta] \} \Rightarrow |S[x] - s[m, x]| < \theta \]

By the Function Extension Axiom 2.1, this also holds for the hyperreals.

Fix any infinite integer n. Let $\theta > 0$ be an arbitrary positive real number. We will
show that $|S[x] - s[n, x]| < \theta$. Since $N[\theta]$ is real and n is infinite, $n > N[\theta]$. Thus the
extended implication above applies with this $\theta$ and m = n, so for any $a \le x \le b$ we have
$|S[x] - s[n, x]| < \theta$. Since $\theta$ is an arbitrary real, $S[x] \approx s[n, x]$.

Conversely, suppose the real tolerance condition fails. Then there are real functions
$M[\theta, N]$ and $X[\theta, N]$ so that the following holds in the reals and hyperreals.

\[ \{ \theta > 0,\; N > 0 \} \Rightarrow \{ a \le X[\theta, N] \le b,\; M[\theta, N] > N,\; |S[X[\theta, N]] - s[M[\theta, N], X[\theta, N]]| \ge \theta \} \]

Applying this to an infinite hyperinteger N, we see that $|S[X[\theta, N]] - s[M[\theta, N], X[\theta, N]]| \ge \theta$
and the infinitesimal condition also fails, that is, if $x = X[\theta, N]$ and $n = M[\theta, N]$, then
n > N is infinite and we do NOT have $S[x] \approx s[n, x]$. This proves the contrapositive of the
statement and completes the proof.

Example 12.1. Uniformly Convergent Series

A series of functions $u_1[x] + u_2[x] + u_3[x] + \cdots$ all defined on an interval I with partial
sums $s[m, x] = \sum_{k=1}^{m} u_k[x]$ converges uniformly to $S[x] = \sum_{k=1}^{\infty} u_k[x]$ on I if and only if,
for every infinite hyperinteger n

\[ \sum_{k=1}^{n} u_k[x] \approx \sum_{k=1}^{\infty} u_k[x] \]

for all hyperreal x, $a \le x \le b$. In other words, if the whole graph of any infinite sum is
infinitely near the graph of the real limiting function.

If a series is NOT converging uniformly, then there is an x where even an infinite number 
of terms of the series fails to approximate the limit function. This can happen even though 
the point x where the approximation fails is not real. We can see this in Example 13.18 and 
Example 12.2. 

We think it is intuitively easier to understand non-uniformity of convergence as “infinitely 
slow convergence” at hyperreal values. 

Example 12.2. $x^n$ Convergence

The sequence of functions

\[ s[m, x] = x^m \]

converges to the function

\[ S[x] = \begin{cases} 1, & x = 1 \\ 0, & -1 < x < 1 \end{cases} \]

In particular, $x^m \to 0$ for $-1 < x < 1$. However, this convergence is not uniform on all of
the interval $-1 < x < 1$.

[Figure: graphs of $y = x^m$]

You can see from the graphs above that each function $x^m$ climbs from near zero to $x^m = 1$
when x = 1. In particular, for every real m there is a $\xi_m$, $0 < \xi_m < 1$, such that $\xi_m^m = \tfrac{1}{2}$
- the graph $y = x^m$ crosses the line y = 1/2. The Function Extension Axiom 2.1 says that
this holds with m = n an infinite integer. (Extend the real function $\xi_m = \sqrt[m]{1/2} = f[m, x]$.)
The infinite function $x^n$ is 1/2 unit away from the limit function S[x] = 0 when $x = \xi_n$. Of
course, $1 \approx \xi_n < 1$.

If we fix any $\xi \approx 1$ with $\xi < 1$, sufficiently large n will make $\xi^n \approx 0$, but some infinite powers will
not be infinitesimal. We could say $\xi^n$ converges to zero “infinitely slowly.”

Example 12.3. $x^n$ Convergence is Uniform for $-r \le x \le r$ if $0 < r < 1$

When r < 1 is a fixed real number, we know that $r^m \to 0$ and if $|x| \le r$ then $|x^m| \le r^m$,
so the convergence of $x^m$ to zero is uniform on $[-r, r]$.
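
The contrast between the last two examples shows up in a short computation. This is an illustrative sketch with names of our own choosing: crossing_point gives the point where the graph of x^m crosses height 1/2, and sup_on_interval gives the largest value of |x^m| on [-r, r], which is r^m.

```python
def crossing_point(m):
    """xi_m with xi_m**m == 1/2, where the graph y = x**m crosses y = 1/2."""
    return 0.5 ** (1.0 / m)

def sup_on_interval(m, r):
    """Largest value of |x**m| for |x| <= r, which is r**m when 0 < r < 1."""
    return r ** m

for m in (1, 10, 100, 1000):
    xi = crossing_point(m)
    # On (-1, 1) there is always a point still 1/2 away from the limit 0,
    # while on [-0.9, 0.9] the worst error r**m shrinks to zero.
    print(m, xi, xi ** m, sup_on_interval(m, 0.9))
```

The crossing points creep toward 1 as m grows, which is the real-number picture of the hyperreal point where an infinite power is still 1/2.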



12.2 Robinson’s Sequential Lemma 



Functionally defined sequences that are infinitesimal for finite indices continue
to be infinitesimal for sufficiently small infinite indices.

Theorem 12.4. Robinson’s Sequential Lemma

Suppose a hyperreal sequence $\varepsilon_k$ is given by a real function of several variables
specified at fixed hyperreal values, $\xi_1, \cdots, \xi_i$,

\[ \varepsilon_k = f[k, \Xi], \quad k = 1, 2, 3, \cdots \]

If $\varepsilon_m \approx 0$ for finite m, then there is an infinite n so that $\varepsilon_k \approx 0$ for all $k \le n$.

Proof:

Let $X = (x_1, x_2, \cdots, x_i)$ be a real multivariable and suppose f[k,X] is defined for all
natural numbers. Assume f[n,X] is non-negative, if necessary replacing it with |f[n,X]|.
(By hypothesis, $\varepsilon_k = f[k, \Xi]$ is defined at $\Xi = (\xi_1, \cdots, \xi_i)$ for all hyperintegers k, I[k] = 1.)
Define a real function

\[ \mu[X] = \begin{cases} 1/\mathrm{Max}[\, n : m \le n \Rightarrow f[m, X] < 1/m \,], & \text{when } f[h, X] \ge 1/h \text{ for some } h \\ 0, & \text{if } f[m, X] < 1/m \text{ for all } m \end{cases} \]

In the reals we have either $\mu[X] = 0$ and f[k,X] < 1/k for all k, or $\mu[X] > 0$ and

\[ k \le 1/\mu[X] \Rightarrow f[k, X] < 1/k \]

Now, consider the value of $\mu[\Xi]$. Since $f[m, \Xi] \approx 0$ for all finite m, $f[m, \Xi] < 1/m$ for
all finite m. Thus $\mu[\Xi] \approx 0$ and either $\mu[\Xi] = 0$ so all infinite n satisfy $f[n, \Xi] < 1/n$ or
$1/\mu[\Xi] = n$ is an infinite hyperinteger and all $k \le n$ satisfy $f[k, \Xi] < 1/k$. When k is finite
we already know $f[k, \Xi] \approx 0$ and when k is infinite, $1/k \approx 0$, so the Lemma is proved.



12.3 Integration of Series 



We can interchange integration and infinite series summation of uniformly
convergent series of continuous functions.



Theorem 12.5. Integration of Series 

Suppose that the series of continuous functions $u_0[x] + u_1[x] + \cdots$ converges uniformly
on the interval [a, b] to a “sum”

\[ S[x] = \lim_{n \to \infty} \left( u_0[x] + \cdots + u_n[x] \right) \]

Then the limit S[x] is continuous and

\[ \int_a^b \lim_{n \to \infty} \left( u_0[x] + \cdots + u_n[x] \right) dx = \lim_{n \to \infty} \int_a^b \left( u_0[x] + \cdots + u_n[x] \right) dx \]

Short notation for this result would simply be that

\[ \int_a^b \sum_{k=0}^{\infty} u_k[x]\, dx = \sum_{k=0}^{\infty} \int_a^b u_k[x]\, dx \]

provided the series is uniformly convergent and the terms are continuous.
Proof:

Continuity means that if $x_1 \approx x_2$, then $S[x_1] \approx S[x_2]$. We need this in order to integrate
S[x]. By continuity of the functions $u_k[x]$, if $x_1 \approx x_2$ and m is a finite integer,
$\sum_{k=0}^{m} u_k[x_1] \approx \sum_{k=0}^{m} u_k[x_2]$. Robinson’s Lemma 12.4 with the function
$f[m, x_1, x_2] = \sum_{k=0}^{m} (u_k[x_1] - u_k[x_2])$ shows that there is an infinite n so that we still have
$\sum_{k=0}^{n} u_k[x_1] \approx \sum_{k=0}^{n} u_k[x_2]$. Uniform convergence gives us the conclusion

\[ \sum_{k=0}^{\infty} u_k[x_1] \approx \sum_{k=0}^{n} u_k[x_1] \approx \sum_{k=0}^{n} u_k[x_2] \approx \sum_{k=0}^{\infty} u_k[x_2] \]

The integral part is easy. Let $\theta > 0$ be any real number and n sufficiently large so that

\[ |S[x] - (u_0[x] + \cdots + u_n[x])| < \frac{\theta}{2(b - a)} \]

Then

\[ \left| \int_a^b S[x]\, dx - \left( \int_a^b u_0[x]\, dx + \cdots + \int_a^b u_n[x]\, dx \right) \right| = \left| \int_a^b S[x] - (u_0[x] + \cdots + u_n[x])\, dx \right| \]

\[ \le \int_a^b |S[x] - (u_0[x] + \cdots + u_n[x])|\, dx \]

\[ \le \int_a^b \frac{\theta}{2(b - a)}\, dx = (b - a)\, \frac{\theta}{2(b - a)} < \theta \]

This shows that the series of numbers $\int_a^b u_0[x]\, dx + \int_a^b u_1[x]\, dx + \int_a^b u_2[x]\, dx + \cdots$ converges
to the number $\int_a^b S[x]\, dx$.
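
A concrete instance may help fix the idea. The sketch below assumes the geometric series 1 + x + x^2 + ..., which sums to 1/(1 - x) and converges uniformly on [0, 1/2]. Integrating the sum from 0 to b gives -Log[1 - b], while integrating term by term gives the series of numbers b^(k+1)/(k+1). The function names are our own.

```python
import math

def series_of_integrals(b, n_terms):
    """Sum of the integrals of x**k over [0, b] for k = 0, ..., n_terms - 1."""
    return sum(b ** (k + 1) / (k + 1) for k in range(n_terms))

def integral_of_sum(b):
    """Integral over [0, b] of 1/(1 - x), the sum of the geometric series."""
    return -math.log(1.0 - b)

b = 0.5
print(series_of_integrals(b, 60), integral_of_sum(b))   # both are Log[2] to rounding
```

The two numbers agree, as the theorem predicts for a uniformly convergent series of continuous terms.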



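Exercise set 12.3

1. When a series converges uniformly and n is infinite, we know $\sum_{k=0}^{n} u_k[x] \approx \sum_{k=0}^{\infty} u_k[x]$.

Show that

\[ \int_a^b \sum_{k=0}^{n} u_k[x]\, dx \approx \int_a^b \sum_{k=0}^{\infty} u_k[x]\, dx \]

and

\[ \int_a^b \left( \sum_{k=0}^{\infty} u_k[x] - \sum_{k=0}^{n} u_k[x] \right) dx \approx 0 \]

What is the meaning of the equation

\[ \int_a^b \sum_{k=0}^{n} u_k[x]\, dx = \sum_{k=0}^{n} \int_a^b u_k[x]\, dx \]

and why is it true?

Prove that

\[ \int_a^b \sum_{k=0}^{\infty} u_k[x]\, dx = \sum_{k=0}^{\infty} \int_a^b u_k[x]\, dx \]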
2. Let $f[m, x] = (m + 1)\, 2x\, (1 - x^2)^m$.

(a) Show that

\[ \int_0^1 f[m, x]\, dx = 1 \text{ for each } m \]

(b) Show that for a fixed real value of x

\[ \lim_{m \to \infty} f[m, x] = 0 \]

(c) Show that

\[ \int_0^1 \lim_{m \to \infty} f[m, x]\, dx = 0 \]

(d) Show that

\[ \lim_{m \to \infty} \int_0^1 f[m, x]\, dx = 1 \]

(e) Explain why f[m, x] does NOT tend to zero uniformly. In fact, for a given infinite
m = n show that there is an $x = \xi \approx 0$ where $f[n, \xi]$ is infinitely far from zero.

(HINT: See Section 26.3 of the main text.) 



12.4 Radius of Convergence 



A power series $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots$ does one of the following:

(a) Converges for all x and converges uniformly and absolutely for
$|x| \le p$ for any constant $p < \infty$. In this case we say the series
has radius of convergence $\infty$.

(b) There is a number r so that the series converges for all x in the
open interval $(-r, r)$, uniformly and absolutely for $|x| \le p < r$ for
any constant p, and diverges for all x with |x| > r. In this case
we say the series has radius of convergence r. Such a series may
converge or diverge at either x = r or $x = -r$.

(c) The series does not converge for any nonzero x. In this case we
say the series has radius of convergence 0.



A general fact about power series is that if we can find a point of convergence, even 
conditional convergence, then we can use geometric comparison to prove convergence at 
smaller values. See Theorem 27.4 of the main text where the following is discussed in more 
detail - but where there is a typo in the proof, (:-(). 
Theorem 12.6. If the power series

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots \]

converges for a particular $x = x_1$, then the series converges uniformly and absolutely
for $|x| \le p < |x_1|$, for any constant p.

Proof:

Because the series converges at $x_1$, we must have $a_n x_1^n \to 0$. If $|x| \le p < |x_1|$, then

\[ |a_n x^n| = |a_n x_1^n| \cdot \left| \frac{x}{x_1} \right|^n \le |a_n x_1^n| \cdot \left| \frac{p}{x_1} \right|^n \]

and, with $r = |p / x_1| < 1$, the geometric series $\sum r^n$
is a geometric majorant for the tail of the series. That is, eventually $|a_n x^n| \le r^n$ and $\sum r^n$
converges. This proves the theorem.

Example 12.4. The Radius of Convergence 

Now consider the cases described in the section summary at the beginning of this section. 
If the series converges for all x we simply say the radius of convergence is $\infty$ and apply the
theorem to see that convergence is uniform on any compact interval.

If the series diverges for all nonzero x there is nothing to show. We simply say the radius
of convergence is zero.

If the series converges for some values of x and diverges for others, we need to show
that it converges in $(-r, r)$, and diverges for |x| > r. Theorem 12.6 shows that if the series
converges for $x_1$, then it converges for all real x satisfying $|x| < |x_1|$.

Consider the sets of numbers

\[ L = \{ s : s < 0 \text{ or the series converges when } x = s \} \]

\[ R = \{ t : t > 0 \text{ and the series diverges when } x = t \} \]

The pair (L, R) is a Dedekind cut on the real numbers (see Definition 1.4.) First, both L
and R are nonempty since there are positive values where the series converges and where
it diverges. Second, if $s \in L$ and $t \in R$, then s < t by Theorem 12.6. Let r be the real
number at the gap of this cut. Then whenever |x| < r, $|x| \in L$ and the series converges,
while when r < |x|, $|x| \in R$ and the series diverges at the positive |x|. It cannot converge
at $-|x|$ because Theorem 12.6 would make the series converge at (|x| + r)/2 > r. Thus the
series converges for |x| < r and diverges for |x| > r.



Exercise set 12.4

1. (a) Find a power series with finite radius of convergence r that converges when x = r,
but diverges when $x = -r$.

(b) Find a power series with finite radius of convergence r that diverges when x = r
and diverges when $x = -r$.

(c) Find a power series with radius of convergence $\infty$.

(d) Find a power series with radius of convergence 0.

(HINT: Try Log[1 + x], 1/(1 - x), make substitutions, ...)
12.5 Calculus of Power Series

We can differentiate and integrate power series inside their radius of
convergence (defined in the preceding section).

Theorem 12.7. Differentiation and Integration of Power Series

Suppose that a power series

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots \]

converges to S[x] for |x| < r, its radius of convergence,

\[ S[x] = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{k=0}^{\infty} a_k x^k \]

Then, the derivative of S[x] exists and the series obtained from term by term
differentiation has the same radius of convergence and converges uniformly absolutely
to it on $|x| \le p < r$.

\[ \frac{dS[x]}{dx} = a_1 + 2 a_2 x + 3 a_3 x^2 + \cdots + n a_n x^{n-1} + \cdots = \sum_{k=1}^{\infty} k\, a_k x^{k-1} \]

The integral of S[x] exists and the series obtained from term by term integration
has the same radius of convergence and converges uniformly absolutely to it on
$|x| \le p < r$.

\[ \int_0^x S[\xi]\, d\xi = a_0 x + \frac{a_1}{2} x^2 + \frac{a_2}{3} x^3 + \cdots + \frac{a_n}{n+1} x^{n+1} + \cdots = \sum_{k=0}^{\infty} \frac{a_k}{k+1} x^{k+1} \]

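Proof:

First, we show that the series $\sum_{k=1}^{\infty} k\, a_k x^{k-1}$ and $\sum_{k=0}^{\infty} \frac{a_k}{k+1} x^{k+1}$ have the same radius
of convergence as $\sum_{k=0}^{\infty} a_k x^k$.

For any x and p,

\[ |k\, a_k x^{k-1}| = \frac{k}{|x|} \left| \frac{x}{p} \right|^k \cdot |a_k p^k| \quad \text{and} \quad \left| \frac{a_k}{k+1} x^{k+1} \right| = \frac{|x|}{k+1} \left| \frac{x}{p} \right|^k \cdot |a_k p^k| \]

When we fix |x| < p < r

\[ |k\, a_k x^{k-1}| \le |a_k p^k| \quad \text{and} \quad \left| \frac{a_k}{k+1} x^{k+1} \right| \le |a_k p^k| \]

for sufficiently large k because then |x/p| < 1 so

\[ \frac{k}{|x|} \left| \frac{x}{p} \right|^k \to 0 \quad \text{and} \quad \frac{|x|}{k+1} \left| \frac{x}{p} \right|^k \to 0 \quad \text{as } k \to \infty \]

See Exercise 12.5.1. In this case since the series $\sum_{k=0}^{\infty} a_k p^k$ converges, the term-by-term
derivative and integral series also converge. (See Exercise 12.5.5.)

When |x| > p > r

\[ |k\, a_k x^{k-1}| \ge |a_k p^k| \quad \text{and} \quad \left| \frac{a_k}{k+1} x^{k+1} \right| \ge |a_k p^k| \]

for sufficiently large k because then |x/p| > 1 so

\[ \frac{k}{|x|} \left| \frac{x}{p} \right|^k \to \infty \quad \text{and} \quad \frac{|x|}{k+1} \left| \frac{x}{p} \right|^k \to \infty \quad \text{as } k \to \infty \]

See Exercise 12.5.2. In this case since the series $\sum_{k=0}^{\infty} a_k p^k$ diverges, the term-by-term
derivative and integral series also diverge. (See Exercise 12.5.5.)

The fact that the integral of the series equals the series of integrals now follows from
Theorem 12.5 applied to an interval $[-p, p]$ with p < r, the radius of convergence.

To prove the derivative part, define a new function

\[ T[x] = \sum_{k=1}^{\infty} k\, a_k x^{k-1} \]

on the interval of convergence, $(-r, r)$. T[x] is continuous by Theorem 12.5. The integral

\[ \int_0^x T[\xi]\, d\xi = \sum_{k=1}^{\infty} a_k \int_0^x k\, \xi^{k-1}\, d\xi = \sum_{k=1}^{\infty} a_k x^k = S[x] - a_0 \]

The second half of the Fundamental Theorem 9.2 says

\[ \frac{dS[x]}{dx} = \frac{d}{dx} \int_0^x T[\xi]\, d\xi = T[x] \]

This proves that the derivative of the series is the series of derivatives.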
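
As a quick sanity check of term by term differentiation, here is a sketch using the geometric series again (our choice of example, not the text's). Its sum is 1/(1 - x) for |x| < 1, so the series of derivatives 1 + 2x + 3x^2 + ... should add up to 1/(1 - x)^2 inside the radius of convergence.

```python
def derivative_series(x, n_terms):
    """Partial sum of k * x**(k-1) for k = 1, ..., n_terms - 1."""
    return sum(k * x ** (k - 1) for k in range(1, n_terms))

def derivative_of_sum(x):
    """Derivative of the closed form 1/(1 - x)."""
    return 1.0 / (1.0 - x) ** 2

x = 0.5
print(derivative_series(x, 200), derivative_of_sum(x))   # both close to 4
```

Closer to the radius of convergence the same agreement holds, but many more terms are needed, in line with the factor k |x/p|^k that appears in the proof.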



Exercise set 12.5

1. Show that if $0 \le p < 1$ then $\lim_{k \to \infty} k\, p^k = 0$.

2. Show that if p > 1 then $\lim_{k \to \infty} \frac{p^k}{k} = \infty$.

3. Prove:

Theorem 12.8.

If the series $a_0 + a_1 + a_2 + a_3 + \cdots$ converges (with terms of arbitrary sign),
then $\lim_{k \to \infty} a_k = 0$.

4. Give a divergent series $a_0 + a_1 + a_2 + a_3 + \cdots$ of positive terms with $\lim_{k \to \infty} a_k = 0$.
(HINT: Harmonic series.)
(HINT: Harmonic series.) 
5. Prove the following.




6. Euler’s Criterion for Convergence

Show that the series $\sum_{k=1}^{\infty} a_k$ converges if and only if whenever m and n are both infinite
hyperintegers,

\[ \sum_{k=m}^{n} a_k \approx 0 \]

7. Prove:

Theorem 12.10. Limit Comparison

Suppose two sequences $a_k$ and $b_k$ satisfy $\lim_{k \to \infty} \frac{a_k}{b_k} = L \ne 0$. Then $\sum_{k=1}^{\infty} a_k$
converges if and only if $\sum_{k=1}^{\infty} b_k$ converges.

HINT: If k is infinite, $a_k = (L + \varepsilon_k)\, b_k$ with $\varepsilon_k \approx 0$. How much is $\sum_{k=m}^{n} a_k$?







The Theory of Fourier Series 

This chapter gives some examples of Fourier series and a basic 
convergence theorem. 



Fourier series and general “orthogonal function expansions” are important in the study 
of heat flow and wave propagation as well as in pure mathematics. The reason that these 
series are important is that sines and cosines satisfy the ‘heat equation’ or ‘wave equation’ or 
‘Laplace’s equation’ for certain geometries of the domain. A general solution of these partial 
differential equations can sometimes be approximated by a series of the simple solutions by 
using superposition. We conclude the background on series with this topic, because Fourier 
series provide many interesting examples of delicately converging series where we still have 
a simple general result on convergence. 

The project on Fourier series shows you how to compute some of your own examples. 
The method of computing Fourier series is quite different from the methods of computing 
power series. 



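The Fourier sine-cosine series associated with f[x] for $-\pi < x \le \pi$ is:

\[ f[x] \sim a_0 + \sum_{k=1}^{\infty} \left[ a_k\, \mathrm{Cos}[k x] + b_k\, \mathrm{Sin}[k x] \right] \]

where

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f[x]\, dx = \text{Average of } f[x] \]

and for $k = 1, 2, 3, \cdots$

\[ a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f[x] \cdot \mathrm{Cos}[k x]\, dx \quad \text{and} \quad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f[x] \cdot \mathrm{Sin}[k x]\, dx \]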
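
The coefficient formulas can be tried out numerically before doing any integration by hand. The sketch below is an illustration only: it assumes the sample function f[x] = |x| on the interval from -pi to pi and approximates each integral with a plain midpoint rule. The helper names integrate and fourier_coefficients are our own.

```python
import math

def integrate(g, lo, hi, n=20000):
    """Midpoint rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def fourier_coefficients(f, k_max):
    """Approximate a_0 and the lists [a_1..a_kmax], [b_1..b_kmax] for f on [-pi, pi]."""
    a0 = integrate(f, -math.pi, math.pi) / (2.0 * math.pi)
    a = [integrate(lambda x, k=k: f(x) * math.cos(k * x), -math.pi, math.pi) / math.pi
         for k in range(1, k_max + 1)]
    b = [integrate(lambda x, k=k: f(x) * math.sin(k * x), -math.pi, math.pi) / math.pi
         for k in range(1, k_max + 1)]
    return a0, a, b

a0, a, b = fourier_coefficients(abs, 3)
print(a0, math.pi / 2)          # average value of |x|
print(a)                        # odd-index cosine terms are nonzero, a_2 vanishes
print(b)                        # sine terms vanish because |x| is even
```

For |x| the numerical values match a_0 = pi/2, a_1 = -4/pi, a_2 = 0 and a_3 = -4/(9 pi), with all sine coefficients zero.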



Dirichlet’s Theorem 13.4 says that if f[x] and f'[x] are $2\pi$-periodic and continuous except
for a finite number of jumps or kinks and if the value $f[x_j]$ is the midpoint of the jump if
there is one at $x_j$, then the Fourier series converges to the function at each point. It may
not converge uniformly; in fact, the approximating graphs may not converge to the graph
of the function, as shown in Gibbs goalposts below. If the periodic function f[x] has no
jumps (but may have a finite number of kinks, or jumps in f'[x]), then the series converges
uniformly to f[x].

Convergence of Fourier series is typically weaker than the convergence of power series, as 
we shall see in the examples, but the weak convergence is still quite useful. Actually, the 
most important kind of convergence for Fourier series is “mean square convergence,”

\[ \int_{-\pi}^{\pi} \left( f[x] - S_n[x] \right)^2 dx \to 0 \]

where $S_n[x]$ is the sum of n terms. This is only a kind of average convergence, since the
integral is still small if the difference is big only on a small set of x's. We won’t go into
mean square convergence except to mention that it sometimes corresponds to the difference
in energy between the ‘wave’ f[x] and its approximation $S_n[x]$. Mean square convergence
has important connections to “Hilbert spaces.”

Convergence of Fourier series at ‘almost every’ point was a notorious problem in mathe- 
matics, with many famous mathematicians making errors about the convergence. Fourier’s 
work was in the early 1800’s and not until 1966 did L. Carleson prove that the Fourier series 
of any continuous function f[x] converges to the function at almost every point. (Dirichlet’s 
Theorem 13.4 uses continuity of f'[x] which may not be true if f[x] is only continuous. Mean
square convergence is much easier to work with, and was well understood much earlier.) 



13.1 Computation of Fourier Series 



This section has some examples of specific Fourier series. 



Three basic examples of Fourier sine - cosine series are animated in the computer program 
FourierSeries. These follow along with some more. “Calculus” is about calculating. The 
following examples indicate the many specific results that we can obtain by performing 
algebra and calculus on Fourier series. Of course, the computation of the basic coefficients 
also requires calculus. 
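The series

\[ |x| = \frac{\pi}{2} - \frac{4}{\pi} \left( \mathrm{Cos}[x] + \frac{\mathrm{Cos}[3x]}{3^2} + \frac{\mathrm{Cos}[5x]}{5^2} + \cdots + \frac{\mathrm{Cos}[(2n+1)x]}{(2n+1)^2} + \cdots \right) \]

converges to the function that equals |x| for $-\pi < x \le \pi$ and is then repeated periodically.
The average value of f[x] is clearly $\pi/2$ and can be computed as the integral

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |x|\, dx = \frac{2}{2\pi} \int_0^{\pi} x\, dx = \frac{1}{\pi} \cdot \frac{\pi^2}{2} = \frac{\pi}{2} \]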



Notice the step in the computation of the integral where we get rid of the absolute value. 
We must do this in order to apply the Fundamental Theorem of Integral Calculus. Absolute 
value does not have an antiderivative. We do the same thing in the computation of the 
other coefficients. 

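\[ a_{2k} = \frac{1}{\pi} \int_{-\pi}^{\pi} |x|\, \mathrm{Cos}[2k x]\, dx \]

\[ = \frac{2}{\pi} \int_0^{\pi} x\, \mathrm{Cos}[2k x]\, dx \]

\[ = \frac{2}{\pi} \left( \frac{x}{2k}\, \mathrm{Sin}[2k x] \Big|_0^{\pi} - \frac{1}{2k} \int_0^{\pi} \mathrm{Sin}[2k x]\, dx \right) \]

\[ = \frac{2}{\pi} \left( \frac{1}{2k} \left( \pi\, \mathrm{Sin}[2k\pi] - 0 \right) + \frac{1}{(2k)^2} \left( \mathrm{Cos}[2k x] \Big|_0^{\pi} \right) \right) \]

\[ = 0 \]

using integration by parts with

\[ u = x, \quad dv = \mathrm{Cos}[2k x]\, dx, \quad du = dx, \quad v = \frac{1}{2k}\, \mathrm{Sin}[2k x] \]

In the Fourier Series project you show that the $a_k$ terms of the Fourier series for f[x] = |x|
with odd k are

\[ a_{2k+1} = -\frac{4}{\pi} \cdot \frac{1}{(2k+1)^2} \]

and all $b_k = 0$.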



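Example 13.2. A Particular Case of the |x| Series 

Set x = 0 in the series 

$$\begin{aligned}
|x| &= \frac{\pi}{2} - \frac{4}{\pi}\left(\mathrm{Cos}[x] + \frac{\mathrm{Cos}[3x]}{3^2} + \frac{\mathrm{Cos}[5x]}{5^2} + \cdots + \frac{\mathrm{Cos}[(2n+1)x]}{(2n+1)^2} + \cdots\right) \\
0 &= \frac{\pi}{2} - \frac{4}{\pi}\left(1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots + \frac{1}{(2n+1)^2} + \cdots\right) \\
\frac{\pi^2}{8} &= 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots + \frac{1}{(2n+1)^2} + \cdots
\end{aligned}$$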



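Example 13.3. Fourier Series for f[x] = |Sin[x]|, for −π < x < π 

$$|\mathrm{Sin}[x]| = \frac{2}{\pi} - \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{\mathrm{Cos}[2kx]}{4k^2-1}$$

In the project on Fourier Series you show that the series 

$$|\mathrm{Sin}[x]| = \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{\mathrm{Cos}[2x]}{3} + \frac{\mathrm{Cos}[4x]}{15} + \frac{\mathrm{Cos}[6x]}{35} + \cdots + \frac{\mathrm{Cos}[2nx]}{(2n)^2-1} + \cdots\right)$$

converges to the function that equals |Sin[x]| for −π < x < π and is then repeated periodically. 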



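Example 13.4. A Particular Case of the |Sin[x]| Series 

Set x = 0 in the series 

$$\begin{aligned}
|\mathrm{Sin}[x]| &= \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{\mathrm{Cos}[2x]}{3} + \frac{\mathrm{Cos}[4x]}{15} + \frac{\mathrm{Cos}[6x]}{35} + \cdots + \frac{\mathrm{Cos}[2nx]}{(2n)^2-1} + \cdots\right) \\
0 &= \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{1}{3} + \frac{1}{15} + \frac{1}{35} + \cdots + \frac{1}{(2n)^2-1} + \cdots\right) \\
\frac{1}{2} &= \frac{1}{3} + \frac{1}{15} + \frac{1}{35} + \cdots + \frac{1}{(2n)^2-1} + \cdots
\end{aligned}$$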



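Example 13.5. Another Case of the |Sin[x]| Series 

Set x = π/2 in the series 

$$\begin{aligned}
|\mathrm{Sin}[x]| &= \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{\mathrm{Cos}[2x]}{3} + \frac{\mathrm{Cos}[4x]}{15} + \frac{\mathrm{Cos}[6x]}{35} + \cdots + \frac{\mathrm{Cos}[2nx]}{(2n)^2-1} + \cdots\right) \\
1 &= \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{-1}{3} + \frac{1}{15} + \frac{-1}{35} + \cdots + \frac{(-1)^n}{(2n)^2-1} + \cdots\right) \\
\frac{\pi}{4} &= \frac{1}{2} + \frac{1}{3} - \frac{1}{15} + \frac{1}{35} - \cdots + \frac{(-1)^{n+1}}{(2n)^2-1} + \cdots
\end{aligned}$$

Adding the sum 1/2 = 1/3 + 1/15 + 1/35 + ⋯ from Example 13.4 doubles the terms with plus signs and cancels the others, so that 

$$\frac{\pi}{8} = \frac{1}{3} + \frac{1}{35} + \cdots + \frac{1}{4(2k+1)^2-1} + \cdots$$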
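The special values obtained by substituting x = 0 and x = π/2 can be checked by adding up many terms. The following Python sketch does this for the sums in Examples 13.2, 13.4, and 13.5; the helper partial_sum and the choice of 100,000 terms are illustrative assumptions.

```python
import math

def partial_sum(term, first, count):
    """Add term(n) for n = first, first + 1, ..., first + count - 1."""
    return sum(term(n) for n in range(first, first + count))

N = 100_000  # number of terms; an arbitrary choice for this illustration

# Example 13.2: 1 + 1/3^2 + 1/5^2 + ...
odd_squares = partial_sum(lambda n: 1 / (2 * n + 1) ** 2, 0, N)
# Example 13.4: 1/3 + 1/15 + 1/35 + ...
sin_at_zero = partial_sum(lambda n: 1 / ((2 * n) ** 2 - 1), 1, N)
# Example 13.5: 1/2 + 1/3 - 1/15 + 1/35 - ...
sin_at_half_pi = 0.5 + partial_sum(
    lambda n: (-1) ** (n + 1) / ((2 * n) ** 2 - 1), 1, N)
# Example 13.5: 1/3 + 1/35 + ... with general term 1/(4(2k+1)^2 - 1)
every_other = partial_sum(lambda k: 1 / (4 * (2 * k + 1) ** 2 - 1), 0, N)

checks = [
    ("pi^2/8", odd_squares, math.pi ** 2 / 8),
    ("1/2", sin_at_zero, 0.5),
    ("pi/4", sin_at_half_pi, math.pi / 4),
    ("pi/8", every_other, math.pi / 8),
]
for name, value, target in checks:
    print(f"{name:>7}: partial sum {value:.6f}, target {target:.6f}")
```

The sums with all positive terms approach their targets slowly, with an error on the order of 1/N, while the alternating sum is much closer for the same number of terms.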



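Example 13.6. Fourier Series for f[x] = x², for −π < x < π 

$$x^2 = \frac{\pi^2}{3} - 4\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k^2}\,\mathrm{Cos}[kx]$$

In the project on Fourier Series you show that 

$$x^2 = \frac{\pi^2}{3} - 4\left(\mathrm{Cos}[x] - \frac{\mathrm{Cos}[2x]}{2^2} + \frac{\mathrm{Cos}[3x]}{3^2} + \cdots + (-1)^{n+1}\frac{\mathrm{Cos}[nx]}{n^2} + \cdots\right)$$

for −π < x < π. 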

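Example 13.7. A Particular Case of the x² Series 

Set x = 0 in the series 

$$x^2 = \frac{\pi^2}{3} - 4\left(\mathrm{Cos}[x] - \frac{\mathrm{Cos}[2x]}{2^2} + \frac{\mathrm{Cos}[3x]}{3^2} + \cdots + (-1)^{n+1}\frac{\mathrm{Cos}[nx]}{n^2} + \cdots\right)$$

to obtain 

$$\frac{\pi^2}{12} = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots + \frac{(-1)^{n+1}}{n^2} + \cdots$$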



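Example 13.8. The Formal Derivative of the Series 

Notice that if we differentiate both sides and (without justification) interchange derivative 
and (infinite) sum, we obtain 

$$\begin{aligned}
\frac{d\,x^2}{dx} &= \frac{d}{dx}\left(\frac{\pi^2}{3} - 4\left(\mathrm{Cos}[x] - \frac{\mathrm{Cos}[2x]}{2^2} + \frac{\mathrm{Cos}[3x]}{3^2} + \cdots + (-1)^{n+1}\frac{\mathrm{Cos}[nx]}{n^2} + \cdots\right)\right) \\
2x &= 4\left(\mathrm{Sin}[x] - \frac{\mathrm{Sin}[2x]}{2} + \frac{\mathrm{Sin}[3x]}{3} + \cdots + (-1)^{n+1}\frac{\mathrm{Sin}[nx]}{n} + \cdots\right)
\end{aligned}$$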



Example 13.9. Fourier Series for the Sawtooth Wave f[x] = x 

[Figure: partial sum of the sawtooth series, terms = 7] 

$$x = 2\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{k}\,\mathrm{Sin}[kx] = 2\left(\mathrm{Sin}[x] - \frac{\mathrm{Sin}[2x]}{2} + \frac{\mathrm{Sin}[3x]}{3} + \cdots + (-1)^{n+1}\frac{\mathrm{Sin}[nx]}{n} + \cdots\right)$$

Without the absolute value the integrals of the Fourier coefficients can be computed 
directly, without breaking them into pieces. The Fourier sine-cosine series for the “sawtooth 
wave,” 

$$f[x] = \begin{cases} x, & \text{for } -\pi < x < \pi \\ 0, & \text{for } |x| = \pi \end{cases}$$

extended to be 2π periodic is easier to compute. Notice that the average a_0 = 0, by 
inspection of the graph or by computation of an integral. Moreover, x Cos[kx] is an odd 
function, that is, (−x) Cos[k · (−x)] = −(x Cos[kx]), so the up areas and down areas of the 
integral cancel, a_k = 0. Finally, you can show that 

$$b_k = (-1)^{k+1}\,\frac{2}{k}$$
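To see how the sawtooth series behaves, its partial sums can be evaluated directly. The sketch below is a minimal Python illustration, assuming the coefficients b_k = 2(−1)^(k+1)/k from this example; the function name sawtooth_partial_sum is a hypothetical helper, not part of the FourierSeries program.

```python
import math

def sawtooth_partial_sum(x, n):
    """Partial sum 2 * sum_{k=1}^{n} (-1)^(k+1) * sin(k*x) / k."""
    return 2 * sum((-1) ** (k + 1) * math.sin(k * x) / k
                   for k in range(1, n + 1))

for n in (7, 50, 500):
    # x = 1.0 is well inside (-pi, pi); x = 3.1 is close to the jump at pi.
    print(n, sawtooth_partial_sum(1.0, n), sawtooth_partial_sum(3.1, n))
```

At x = 1.0 the partial sums settle near 1 fairly quickly, while at x = 3.1 many more terms are needed. At x = π every term vanishes, which matches the value 0 assigned to the sawtooth wave at the jump.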



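Example 13.10. A Particular Case of the x Series 

Set x = π/2 in the series 

$$\begin{aligned}
x &= 2\left(\mathrm{Sin}[x] - \frac{\mathrm{Sin}[2x]}{2} + \frac{\mathrm{Sin}[3x]}{3} + \cdots + (-1)^{n+1}\frac{\mathrm{Sin}[nx]}{n} + \cdots\right) \\
\frac{\pi}{4} &= 1 - \frac{1}{3} + \frac{1}{5} - \cdots + \frac{(-1)^n}{2n+1} + \cdots
\end{aligned}$$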



Example 13.11. Fourier Series for the Square Wave f[x] = Sign[x] 

In the Fourier Series project you show that 

$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi}\mathrm{Sign}[x]\,\mathrm{Cos}[kx]\,dx = 0$$

and 

$$b_{2k} = 0$$

because each piece of the integral ∫_0^π Sin[2kx] dx = 0, being the integral over whole periods 
of the sine function. 

Also, 

$$b_{2k+1} = \frac{2}{\pi}\int_0^{\pi}\mathrm{Sin}[(2k+1)x]\,dx = \frac{2}{\pi}\,\frac{1}{2k+1}\int_0^{(2k+1)\pi}\mathrm{Sin}[u]\,du = \frac{4}{\pi}\,\frac{1}{2k+1}$$



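Example 13.12. A Particular Case of the Sign[x] Series 

Substituting x = π/2 in the series 

$$\mathrm{Sign}[x] = \frac{4}{\pi}\left(\mathrm{Sin}[x] + \frac{1}{3}\,\mathrm{Sin}[3x] + \frac{1}{5}\,\mathrm{Sin}[5x] + \cdots\right)$$

gives 

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \cdots + \frac{(-1)^n}{2n+1} + \cdots$$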



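Example 13.13. The Series for Cos[ωx], for Non-Integer ω 

In the project on Fourier Series you show that when ω is not an integer, 

$$\mathrm{Cos}[\omega x] = \frac{2\omega\,\mathrm{Sin}[\omega\pi]}{\pi}\left(\frac{1}{2\omega^2} - \frac{\mathrm{Cos}[x]}{\omega^2-1^2} + \frac{\mathrm{Cos}[2x]}{\omega^2-2^2} + \cdots + (-1)^n\frac{\mathrm{Cos}[nx]}{\omega^2-n^2} + \cdots\right)$$

Substituting x = π and performing some algebra, we obtain 

$$\begin{aligned}
\mathrm{Cos}[\omega\pi] &= \frac{2\omega\,\mathrm{Sin}[\omega\pi]}{\pi}\left(\frac{1}{2\omega^2} + \frac{1}{\omega^2-1^2} + \frac{1}{\omega^2-2^2} + \cdots\right) \\
\pi\,\frac{\mathrm{Cos}[\omega\pi]}{\mathrm{Sin}[\omega\pi]} &= 2\omega\left(\frac{1}{2\omega^2} + \frac{1}{\omega^2-1^2} + \frac{1}{\omega^2-2^2} + \cdots\right) \\
\pi\,\frac{\mathrm{Cos}[\omega\pi]}{\mathrm{Sin}[\omega\pi]} - \frac{1}{\omega} &= \frac{2\omega}{\omega^2-1^2} + \frac{2\omega}{\omega^2-2^2} + \cdots \\
\int_0^x\left(\pi\,\frac{\mathrm{Cos}[\omega\pi]}{\mathrm{Sin}[\omega\pi]} - \frac{1}{\omega}\right)d\omega &= \int_0^x\frac{2\omega}{\omega^2-1^2}\,d\omega + \int_0^x\frac{2\omega}{\omega^2-2^2}\,d\omega + \cdots + \int_0^x\frac{2\omega}{\omega^2-n^2}\,d\omega + \cdots \\
\mathrm{Log}\left[\frac{\mathrm{Sin}[\pi x]}{\pi x}\right] &= \mathrm{Log}\left[1-\frac{x^2}{1^2}\right] + \mathrm{Log}\left[1-\frac{x^2}{2^2}\right] + \cdots + \mathrm{Log}\left[1-\frac{x^2}{n^2}\right] + \cdots \\
\mathrm{Sin}[\pi x] &= \pi x\prod_{k=1}^{\infty}\left(1-\frac{x^2}{k^2}\right)
\end{aligned}$$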
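The product formula for Sin[πx] at the end of this example can be tested with partial products. This is a small Python sketch under the assumption that a finite product with n factors is an adequate stand-in for the infinite one; sine_product is a name chosen here for illustration.

```python
import math

def sine_product(x, n_factors):
    """Partial product pi*x * prod_{k=1}^{n} (1 - x^2/k^2)."""
    p = math.pi * x
    for k in range(1, n_factors + 1):
        p *= 1 - x * x / (k * k)
    return p

x = 0.5
for n in (10, 1000, 100000):
    # Compare with sin(pi*x); at x = 1/2 the exact value is 1.
    print(n, sine_product(x, n), math.sin(math.pi * x))
```

The omitted factors multiply to roughly 1 − x²/n, so the partial products approach Sin[πx] slowly. At an integer x one factor is exactly zero, in agreement with the zeros of the sine.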



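Example 13.14. The Derivative of the Series for Cos[ωx] 

Formally differentiating, 

$$\begin{aligned}
\frac{d\,\mathrm{Cos}[\omega x]}{dx} &= \frac{2\omega\,\mathrm{Sin}[\omega\pi]}{\pi}\,\frac{d}{dx}\left(\frac{1}{2\omega^2} - \frac{\mathrm{Cos}[x]}{\omega^2-1^2} + \frac{\mathrm{Cos}[2x]}{\omega^2-2^2} + \cdots + (-1)^n\frac{\mathrm{Cos}[nx]}{\omega^2-n^2} + \cdots\right) \\
-\mathrm{Sin}[\omega x] &= \frac{2\,\mathrm{Sin}[\omega\pi]}{\pi}\left(\frac{\mathrm{Sin}[x]}{\omega^2-1^2} - \frac{2\,\mathrm{Sin}[2x]}{\omega^2-2^2} + \cdots + (-1)^{n+1}\frac{n\,\mathrm{Sin}[nx]}{\omega^2-n^2} + \cdots\right)
\end{aligned}$$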



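Example 13.15. The Series for Sin[ωx], for Non-Integer ω 

In the project on Fourier Series you show that when ω is not an integer, 

$$\mathrm{Sin}[\omega x] = \frac{2\,\mathrm{Sin}[\omega\pi]}{\pi}\left(-\frac{\mathrm{Sin}[x]}{\omega^2-1^2} + \frac{2\,\mathrm{Sin}[2x]}{\omega^2-2^2} + \cdots + (-1)^n\frac{n\,\mathrm{Sin}[nx]}{\omega^2-n^2} + \cdots\right)$$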



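Example 13.16. Hyperbolic Functions Restricted to −π < x < π 

We can also compute 

$$\cosh[\omega x] = \frac{2\omega\,\sinh[\omega\pi]}{\pi}\left(\frac{1}{2\omega^2} - \frac{\mathrm{Cos}[x]}{\omega^2+1^2} + \frac{\mathrm{Cos}[2x]}{\omega^2+2^2} + \cdots + (-1)^n\frac{\mathrm{Cos}[nx]}{\omega^2+n^2} + \cdots\right)$$

$$\sinh[\omega x] = \frac{2\,\sinh[\omega\pi]}{\pi}\left(\frac{\mathrm{Sin}[x]}{\omega^2+1^2} - \frac{2\,\mathrm{Sin}[2x]}{\omega^2+2^2} + \cdots + (-1)^{n+1}\frac{n\,\mathrm{Sin}[nx]}{\omega^2+n^2} + \cdots\right)$$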



Exercise set 13.1 

1. Use the computer to plot the Fourier series examples above. 



13.2 Convergence for Piecewise Smooth Functions 

Fourier series of piecewise smooth functions converge. 



A function f[x] on −π < x < π is said to be piecewise continuous if it is continuous 
except for at most a finite number of jump discontinuities. That is, except for finitely many 
values of x, lim_{ξ→x} f[ξ] = f[x], and at the finite number of other points x_j, f[x] has a jump 
discontinuity, meaning the one-sided limits lim_{ξ↑x_j} f[ξ] and lim_{ξ↓x_j} f[ξ] both exist. (f[x_j] can exist, but need 
not equal either one-sided limit. Fourier series will converge to the midpoint of a jump.) A 
function is said to be piecewise smooth if both f[x] and f'[x] are piecewise continuous. 

Piecewise smooth functions can be continuous, like the periodic extension of |x| for −π < 
x < π. In this case the function has a kink, that is, its derivative f'[x] has a jump discontinuity 
because f'[x] = −1 for x < 0 and f'[x] = +1 for x > 0. (See Figure 13.1.) 

Periodic extension can create jump discontinuities, like x for −π < x < π. In this case 
lim_{x↑π} f[x] = π but lim_{x↓π} “periodic x” = lim_{x↓−π} x = −π. (See Figure 13.2.) 

Periodic extension can create jumps in the derivative of a smooth function like x². Sketch 
the graph from −π to π, extend periodically, and observe that the graph is continuous, but 
not smooth at odd multiples of π. 

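We need the following 

Theorem 13.1. A Trig Identity and Integral 

$$\frac{1}{2} + \sum_{k=1}^{n}\mathrm{Cos}[k\phi] = \frac{\mathrm{Sin}[(n+\frac{1}{2})\phi]}{2\,\mathrm{Sin}[\frac{1}{2}\phi]}$$

and 

$$\frac{1}{\pi}\int_0^{\pi}\frac{\mathrm{Sin}[(n+\frac{1}{2})\phi]}{\mathrm{Sin}[\frac{1}{2}\phi]}\,d\phi = 1$$

Proof: 

By Euler’s formula, Cos[θ] = (1/2)(e^{iθ} + e^{−iθ}). 

$$\begin{aligned}
\frac{1}{2} + \sum_{k=1}^{n}\mathrm{Cos}[k\phi] &= \frac{1}{2}\sum_{k=-n}^{n} e^{ik\phi} = \frac{e^{-in\phi}}{2}\sum_{h=0}^{2n} r^h, \qquad \text{for } r = e^{i\phi} \\
&= \frac{e^{-in\phi}}{2}\,\frac{1-r^{2n+1}}{1-r} = \frac{e^{-in\phi}}{2}\,\frac{1-e^{i(2n+1)\phi}}{1-e^{i\phi}} \\
&= \frac{1}{2}\,\frac{e^{-in\phi}-e^{i(n+1)\phi}}{1-e^{i\phi}}\cdot\frac{e^{-i\phi/2}}{e^{-i\phi/2}} \\
&= \frac{1}{2}\,\frac{e^{i(n+\frac{1}{2})\phi}-e^{-i(n+\frac{1}{2})\phi}}{2i}\cdot\frac{2i}{e^{i\phi/2}-e^{-i\phi/2}} \\
&= \frac{\mathrm{Sin}[(n+\frac{1}{2})\phi]}{2\,\mathrm{Sin}[\frac{1}{2}\phi]}
\end{aligned}$$

using Euler’s formula, Sin[θ] = (1/(2i))(e^{iθ} − e^{−iθ}). 

For the integral, integrate the cosine sum in the identity over 0 ≤ φ ≤ π and notice that all 
the terms vanish except for the constant term 1/2, which contributes π/2. 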
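Both parts of Theorem 13.1 are easy to test numerically. The Python sketch below compares the cosine sum with the closed form at a few angles and approximates the integral with a midpoint rule (which avoids the removable singularity at φ = 0). The function names and sample count are assumptions made for this illustration.

```python
import math

def cosine_sum(phi, n):
    """Left side of the identity: 1/2 + sum_{k=1}^{n} cos(k*phi)."""
    return 0.5 + sum(math.cos(k * phi) for k in range(1, n + 1))

def closed_form(phi, n):
    """Right side: sin((n + 1/2)*phi) / (2*sin(phi/2)), for phi != 0."""
    return math.sin((n + 0.5) * phi) / (2 * math.sin(phi / 2))

def kernel_integral(n, samples=20000):
    """Midpoint-rule value of (1/pi) * integral over [0, pi] of
    sin((n + 1/2)*phi) / sin(phi/2)."""
    h = math.pi / samples
    total = 0.0
    for i in range(samples):
        phi = (i + 0.5) * h  # midpoints never hit phi = 0
        total += math.sin((n + 0.5) * phi) / math.sin(phi / 2)
    return total * h / math.pi

for n in (1, 5, 20):
    print(n, cosine_sum(0.7, n), closed_form(0.7, n), kernel_integral(n))
```

For each n the first two printed values should agree to rounding error and the last should be very close to 1.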



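Example 13.17. The Dirichlet Kernel 

$$S_n[x] = \frac{1}{2\pi}\int_{-\pi}^{\pi} f[x+\theta]\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\frac{1}{2}\theta]}\,d\theta$$

We use this identity to write a partial sum of a Fourier series as an integral. By the 
definition of a partial sum of the Fourier series, for any n, 

$$\begin{aligned}
S_n[x] &= a_0 + \sum_{k=1}^{n}\left(a_k\,\mathrm{Cos}[kx] + b_k\,\mathrm{Sin}[kx]\right) \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} f[\xi]\left(\frac{1}{2} + \sum_{k=1}^{n}\left(\mathrm{Cos}[k\xi]\,\mathrm{Cos}[kx] + \mathrm{Sin}[k\xi]\,\mathrm{Sin}[kx]\right)\right)d\xi \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} f[\xi]\left(\frac{1}{2} + \sum_{k=1}^{n}\mathrm{Cos}[k(\xi-x)]\right)d\xi \\
&= \frac{1}{\pi}\int_{-\pi}^{\pi} f[\xi]\cdot\frac{\mathrm{Sin}[(n+\frac{1}{2})(\xi-x)]}{2\,\mathrm{Sin}[\frac{1}{2}(\xi-x)]}\,d\xi \qquad \text{by the trig identity above.}
\end{aligned}$$

Now make a change of variable and differential, ξ = x + θ, dξ = dθ, and use periodicity 
of f[x] to see 

$$S_n[x] = \frac{1}{2\pi}\int_{-\pi}^{\pi} f[x+\theta]\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\frac{1}{2}\theta]}\,d\theta$$

Notice that this integrand is well-behaved even near θ = 0 where the denominator tends 
to zero. (Sin[x]/x extends smoothly to x = 0, as you can easily see from the power series 
for sine.) 

The intuitive idea using this formula in the convergence theorem given next is to think 
of the sine fraction as giving a “measure” for each n, 

$$d\mu_n[\theta] = \frac{1}{2\pi}\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\frac{1}{2}\theta]}\,d\theta$$

Each of these measures has total “mass” 1, 

$$\int_{-\pi}^{\pi}(1)\,d\mu_n[\theta] = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\frac{1}{2}\theta]}\,d\theta = 1$$

As n increases, more and more of this unit mass is concentrated near θ = 0. 
Since θ ≈ 0 implies f[x + θ] ≈ f[x], we expect 

$$\int_{-\pi}^{\pi}\left(f[x+\theta]\right)d\mu_n[\theta] \approx f[x]$$

when n is large. It is not quite this simple, but it is easy to show that the measure is less 
and less important away from zero. 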

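Theorem 13.2. Suppose f[x] is 2π-periodic and piecewise smooth. Then for any fixed 
ε > 0, 

$$\lim_{n\to\infty}\int_{\varepsilon}^{\pi} f[x+\theta]\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\frac{1}{2}\theta]}\,d\theta = 0$$

(The same argument applies to the integral over −π ≤ θ ≤ −ε.) 

Proof: 

Define the function 

$$g[\theta] = f[x+\theta]/\mathrm{Sin}[\theta/2]$$

so that 

$$\int_{\varepsilon}^{\pi} f[x+\theta]\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\frac{1}{2}\theta]}\,d\theta = \int_{\varepsilon}^{\pi} g[\theta]\,\mathrm{Sin}[(n+\tfrac{1}{2})\theta]\,d\theta$$

Since f[x] is piecewise smooth and Sin[θ/2] is nonzero for ε ≤ θ ≤ π, 
g[θ] is piecewise smooth and we may integrate by parts: 

$$\int_{\varepsilon}^{\pi} g[\theta]\,\mathrm{Sin}[(n+\tfrac{1}{2})\theta]\,d\theta = \frac{1}{n+\frac{1}{2}}\left(g[\varepsilon]\,\mathrm{Cos}[(n+\tfrac{1}{2})\varepsilon] - g[\pi]\,\mathrm{Cos}[(n+\tfrac{1}{2})\pi] + \int_{\varepsilon}^{\pi} g'[\theta]\,\mathrm{Cos}[(n+\tfrac{1}{2})\theta]\,d\theta\right) \to 0, \quad \text{as } n\to\infty$$

(If g[θ] has jumps inside the interval, integrate by parts on each smooth piece; this adds 
only a finite number of bounded boundary terms inside the parentheses.) 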

In order to prove the strongest form of the convergence theorem, we need the following 
generalization of this result. 

Theorem 13.3. Suppose that g[θ] is piecewise continuous on the subinterval [a, b] of 
[−π, π]. Then 

$$\int_a^b g[\theta]\,\mathrm{Sin}[\nu\theta]\,d\theta \to 0 \quad \text{as } \nu\to\infty$$

Proof: 

Let ν be a large integer. The interval a < x < b is divided into a sequence of adjacent 
subintervals [x_{2j}, x_{2j+1}], [x_{2j+1}, x_{2j+2}] of length π/ν where ν x_{2j} is an even multiple of π 
and ν x_{2j+1} is an odd multiple of π. These are simply the points of the form kπ/ν, for 
integers k, that lie in the interval. 

There may be as many as 2 exceptional unequal length subintervals at the ends and 
one additional non-matched subinterval of odd and even multiples of π/ν. Re-number the 
sequence beginning with x_1 = a and ending with b. 

The integral over the matched pairs is 

$$\sum_{j}\left(\int_{x_{2j}}^{x_{2j+1}} g[x]\,\mathrm{Sin}[\nu x]\,dx + \int_{x_{2j+1}}^{x_{2j+2}} g[x]\,\mathrm{Sin}[\nu x]\,dx\right)$$

Sine is positive on one subinterval of each pair and negative on the next with 

$$\int_{x_{2j}}^{x_{2j+1}}\mathrm{Sin}[\nu x]\,dx = -\int_{x_{2j+1}}^{x_{2j+2}}\mathrm{Sin}[\nu x]\,dx$$

The same decomposition is true when ν is an infinite hyperreal. We may write 

$$\begin{aligned}
\int_a^b g[x]\,\mathrm{Sin}[\nu x]\,dx = {} & \int_a^{x_1} g[x]\,\mathrm{Sin}[\nu x]\,dx \\
& + \sum_{j=1}^{n}\left(\int_{x_{2j-1}}^{x_{2j}} g[x]\,\mathrm{Sin}[\nu x]\,dx + \int_{x_{2j}}^{x_{2j+1}} g[x]\,\mathrm{Sin}[\nu x]\,dx\right) \\
& + \int_{x_{2n+1}}^{b} g[x]\,\mathrm{Sin}[\nu x]\,dx
\end{aligned}$$

When ν is infinite, g[θ] changes very little on [x_{2j−1}, x_{2j+1}], in fact, the maximum and 
minimum over a pair of subintervals differ by an infinitesimal: there is a sequence of 
infinitesimals η_j such that 

$$|g[\xi] - g[\zeta]| \le \eta_j \approx 0, \quad \text{for } x_{2j-1} \le \xi \le \zeta \le x_{2j+1}$$

Sine is positive on one and negative on the other so that the adjacent subintegrals nearly 
cancel. 

$$\begin{aligned}
\left|\int_{x_{2j-1}}^{x_{2j}} g[x]\,\mathrm{Sin}[\nu x]\,dx + \int_{x_{2j}}^{x_{2j+1}} g[x]\,\mathrm{Sin}[\nu x]\,dx\right| &= \left|\int_{x_{2j-1}}^{x_{2j}}\left(g[x]\,\mathrm{Sin}[\nu x] + g[x+\tfrac{\pi}{\nu}]\,\mathrm{Sin}[\nu x+\pi]\right)dx\right| \\
&= \left|\int_{x_{2j-1}}^{x_{2j}}\left(g[x] - g[x+\tfrac{\pi}{\nu}]\right)\mathrm{Sin}[\nu x]\,dx\right| \\
&\le \mathrm{Max}\left[\,|g[x] - g[x+\tfrac{\pi}{\nu}]| : x_{2j-1}\le x\le x_{2j}\right]\int_{x_{2j-1}}^{x_{2j}} 1\,dx \le \eta_j\,\frac{\pi}{\nu}
\end{aligned}$$

and 

$$\begin{aligned}
\left|\sum_{j=1}^{n}\left(\int_{x_{2j-1}}^{x_{2j}} g[x]\,\mathrm{Sin}[\nu x]\,dx + \int_{x_{2j}}^{x_{2j+1}} g[x]\,\mathrm{Sin}[\nu x]\,dx\right)\right| &\le \sum_{j=1}^{n}\eta_j\,\frac{\pi}{\nu} \\
&\le \mathrm{Max}[\eta_j : 1\le j\le n]\,\sum_{j=1}^{n}\frac{\pi}{\nu} \\
&\le \eta_h\cdot(b-a)/2 \approx 0
\end{aligned}$$

The few stray infinitesimal end subintervals contribute at most an infinitesimal since g[x] is 
bounded, so 

$$\int_a^b g[x]\,\mathrm{Sin}[\nu x]\,dx \approx 0$$

and the theorem is proved (by the characterization of limits with infinite indices.) 
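The cancellation described in this proof can be watched numerically. The Python sketch below integrates g[θ] Sin[νθ] for a hypothetical step function g with one jump, chosen here only for illustration, and prints the result for increasing ν. The midpoint rule and sample count are also assumptions of the sketch.

```python
import math

def oscillatory_integral(g, a, b, nu, samples=100000):
    """Midpoint-rule value of the integral of g(theta)*sin(nu*theta) over [a, b]."""
    h = (b - a) / samples
    total = 0.0
    for i in range(samples):
        theta = a + (i + 0.5) * h
        total += g(theta) * math.sin(nu * theta)
    return total * h

def step(theta):
    """A hypothetical piecewise continuous g with a jump at theta = 1."""
    return 1.0 if theta < 1.0 else 3.0

for nu in (1, 10, 100, 1000):
    print(nu, oscillatory_integral(step, 0.0, 2.0, nu))
```

For this particular g the integral can be computed exactly and is bounded by 8/ν, so the printed values shrink roughly in proportion to 1/ν even though g is discontinuous.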
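Theorem 13.4. Dirichlet’s Convergence Theorem 

If f[x] is a piecewise smooth 2π-periodic function, then the Fourier series of f[x] 
converges to the function at each point of continuity of f[x] and converges to the 
midpoint of the jump at the finite number of jump discontinuities, for all x, 

$$\frac{1}{2}\left(\lim_{\xi\uparrow x} f[\xi] + \lim_{\xi\downarrow x} f[\xi]\right) = a_0 + \sum_{k=1}^{\infty} a_k\,\mathrm{Cos}[kx] + \sum_{k=1}^{\infty} b_k\,\mathrm{Sin}[kx]$$

If f[x] is continuous at x, f[x] = (1/2)(lim_{ξ↑x} f[ξ] + lim_{ξ↓x} f[ξ]). 

Proof: 

Fix a value of x and let F_↓ = lim_{ξ↓x} f[ξ]. Then lim_{ξ↓x} (f[ξ] − F_↓) = 0 and the piecewise 
derivative of f[x] means lim_{Δx↓0} (f[x + Δx] − F_↓)/Δx exists. 

Since lim_{Δx→0} Sin[Δx]/Δx = lim_{Δx→0} 2 Sin[Δx/2]/Δx = 1, we also have that 

$$\lim_{\Delta x\downarrow 0}\frac{f[x+\Delta x] - F_{\downarrow}}{2\,\mathrm{Sin}[\Delta x/2]}$$

exists. This discussion means that the function 

$$g[\theta] = \frac{f[x+\theta] - F_{\downarrow}}{\mathrm{Sin}[\theta/2]}$$

is piecewise continuous on [0, π] and Theorem 13.3 says 

$$\int_0^{\pi} g[\theta]\,\mathrm{Sin}[(n+\tfrac{1}{2})\theta]\,d\theta \to 0 \quad \text{as } n\to\infty$$

Thus, we see that 

$$\frac{1}{2\pi}\int_0^{\pi}\frac{f[x+\theta] - F_{\downarrow}}{\mathrm{Sin}[\theta/2]}\,\mathrm{Sin}[(n+\tfrac{1}{2})\theta]\,d\theta \to 0$$

$$\frac{1}{2\pi}\int_0^{\pi} f[x+\theta]\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\theta/2]}\,d\theta \to F_{\downarrow}\cdot\frac{1}{2\pi}\int_0^{\pi}\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\theta/2]}\,d\theta = \frac{1}{2}\lim_{\xi\downarrow x} f[\xi]$$

Similarly, 

$$\frac{1}{2\pi}\int_{-\pi}^{0} f[x+\theta]\,\frac{\mathrm{Sin}[(n+\frac{1}{2})\theta]}{\mathrm{Sin}[\theta/2]}\,d\theta \to \frac{1}{2}\lim_{\xi\uparrow x} f[\xi]$$

Adding the two halves gives the limit of the partial sums S_n[x]. This proves the theorem. 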



Exercise set 13.2 

Intuitively, many of the weakly convergent Fourier series are converging by cancelling 
oscillations. If this is true, we would expect averages to be even better approximations. 

1. Let s_m[x] = 1/2 + Σ_{k=1}^{m} Cos[kx] be the partial Fourier-like sum. Define the 
average of the partial sums to be 

$$a_n[x] = \frac{1}{1+n}\sum_{m=0}^{n} s_m[x]$$

When S_m[x] = a_0 + Σ_{k=1}^{m} (a_k Cos[kx] + b_k Sin[kx]) are the partial Fourier sums of a 
function f[x], let 

$$A_n[x] = \frac{1}{1+n}\sum_{m=0}^{n} S_m[x]$$

denote the average of the first n Fourier sums. 

(a) Plot the average Fourier sums A_n[x] for the examples of the previous section, 
especially those that converge weakly like f[x] = x at ±π or f[x] = Sign[x]. 

(b) Show that 

$$a_n[x] = \frac{1}{2(1+n)}\left(\frac{\mathrm{Sin}[(n+1)\frac{x}{2}]}{\mathrm{Sin}[\frac{x}{2}]}\right)^2$$

(c) Show that 

$$\frac{1}{\pi}\int_0^{\pi}\frac{1}{1+n}\left(\frac{\mathrm{Sin}[(n+1)\frac{x}{2}]}{\mathrm{Sin}[\frac{x}{2}]}\right)^2 dx = 1$$

(d) Show that the averages of the first n Fourier sums of a function f[x] are given by 

$$A_n[x] = \frac{1}{2\pi}\int_{-\pi}^{\pi} f[x+\theta]\,\frac{1}{1+n}\left(\frac{\mathrm{Sin}[(n+1)\frac{\theta}{2}]}{\mathrm{Sin}[\frac{\theta}{2}]}\right)^2 d\theta$$



13.3 Uniform Convergence for Continuous Piecewise Smooth Functions 

Fourier series of continuous piecewise smooth functions converge uniformly. 



Theorem 13.5. Uniform Convergence of Fourier Series 

If f[x] is continuous and f'[x] is piecewise continuous, then its Fourier series 
converges absolutely and uniformly to the function. Moreover, the Fourier series 
of any piecewise smooth function converges uniformly to the function on any closed 
subinterval where the function is continuous. 

Proof of this theorem requires some inequalities related to mean square convergence. In 
particular, 

$$a_0^2 + \frac{1}{2}\sum_{k=1}^{n}\left(a_k^2 + b_k^2\right) \le \frac{1}{2\pi}\int_{-\pi}^{\pi}\left(f[x]\right)^2 dx \qquad \text{and for any sequences} \qquad \left(\sum_{k=1}^{n} u_k v_k\right)^2 \le \sum_{k=1}^{n} u_k^2\,\sum_{k=1}^{n} v_k^2$$

We refer the reader to a book on Fourier series. 

We do want to contrast uniform and non-uniform convergence by comparing some of the 
continuous and discontinuous examples computed above. 

Example 13.18. Infinitely Slowly Convergent Series with a Discontinuous Limit 

Fourier series can converge delicately. For example, the identity 

$$x = 2\left(\mathrm{Sin}[x] - \frac{\mathrm{Sin}[2x]}{2} + \frac{\mathrm{Sin}[3x]}{3} + \cdots + (-1)^{n+1}\frac{\mathrm{Sin}[nx]}{n} + \cdots\right)$$

is a valid convergent series for −π < x < π. However, the Weierstrass majorization does 
not yield a simple convergence estimate, because 

$$\left|(-1)^{n+1}\frac{\mathrm{Sin}[nx]}{n}\right| \le \frac{1}{n}$$

is a useless upper estimate by a divergent series, Σ 1/n = ∞. This Fourier series converges 
but not uniformly, and its limit function is discontinuous because repeating x periodically 
produces a jump at π as follows: 

[Figure: partial sum of the series for x, terms = 7] 

The convergence of the Fourier series for Sign[x] 

$$\frac{4}{\pi}\left(\mathrm{Sin}[x] + \frac{1}{3}\,\mathrm{Sin}[3x] + \frac{1}{5}\,\mathrm{Sin}[5x] + \cdots\right) = \mathrm{Sign}[x]$$

holds at every fixed point, but the convergence is not uniform. 

In fact, the graphs of the approximations do not converge to a “square wave,” but rather 
to “goal posts.” Each partial sum has an overshoot next to the jump in Sign[x], and 
in the limit these trace out a straight line segment longer than the distance between ±1. You can see this 
for yourself in the animation of the computer program FourierSeries. A book on Fourier 
series will show you how to estimate the height of the overshoot. 

In both of these examples, no matter how many terms we take in the Fourier series, even 
a hyperreal infinite number, there will always be an x close to the jump where the partial 
sum is half way between the one-sided limit and the midpoint of the jump. In this sense 
the series is converging “infinitely slowly” near the jump. 
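The overshoot near the jump can be measured directly from the partial sums. The following Python sketch evaluates partial sums of the Sign[x] series on a grid and reports the largest value found. The function names and grid size are illustrative assumptions, and the reference value (2/π)·Si(π) ≈ 1.179 is the standard Gibbs constant quoted from the general theory rather than derived in this text.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """(4/pi) * sum_{k=0}^{n_terms-1} sin((2k+1)*x) / (2k+1)."""
    return 4 / math.pi * sum(math.sin((2 * k + 1) * x) / (2 * k + 1)
                             for k in range(n_terms))

def max_near_jump(n_terms, grid=2000):
    """Largest partial-sum value on a uniform grid of points in (0, pi/2]."""
    step = (math.pi / 2) / grid
    return max(square_wave_partial_sum((i + 1) * step, n_terms)
               for i in range(grid))

for n_terms in (10, 50, 200):
    print(n_terms, max_near_jump(n_terms))
```

Taking more terms moves the peak closer to the jump at x = 0 but does not lower it: the maximum stays near 1.18 instead of approaching 1, which is the non-uniform convergence described above.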

13.4 Integration of Fourier Series 



Fourier series of piecewise smooth functions can be integrated termwise, 
even if the series are not uniformly convergent. 



Theorem 13.6. Integration of Fourier Series 

Let f[x] be a piecewise continuous 2π-periodic function with Fourier series 

$$a_0 + \sum_{k=1}^{\infty} a_k\,\mathrm{Cos}[kx] + \sum_{k=1}^{\infty} b_k\,\mathrm{Sin}[kx]$$

(which we do not even assume is convergent.) The Fourier series can be integrated 
between any two limits −π ≤ α < ξ ≤ π and 

$$\int_{\alpha}^{\xi} f[x]\,dx = a_0(\xi-\alpha) + \sum_{k=1}^{\infty}\left(\int_{\alpha}^{\xi} a_k\,\mathrm{Cos}[kx]\,dx + \int_{\alpha}^{\xi} b_k\,\mathrm{Sin}[kx]\,dx\right)$$

Moreover, the series on the right converges uniformly in ξ. 



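Proof: 

Define the function 

$$F[x] = \int_{-\pi}^{x}\left(f[\xi] - a_0\right)d\xi$$

Then F[x] is continuous inside the interval (−π, π) and piecewise smooth. Since a_0 = 
(1/(2π)) ∫_{−π}^{π} f[ξ] dξ, F[−π] = 0 = F[π], and F[x] has a continuous periodic extension. Applying 
Theorem 13.5, the Fourier series for F[x] converges uniformly. Denote this series by 

$$F[x] = A_0 + \sum_{k=1}^{\infty}\left(A_k\,\mathrm{Cos}[kx] + B_k\,\mathrm{Sin}[kx]\right)$$

Apply integration by parts to the definitions of the Fourier coefficients with k > 0, 

$$\begin{aligned}
A_k &= \frac{1}{\pi}\int_{-\pi}^{\pi} F[x]\,\mathrm{Cos}[kx]\,dx \\
&= \frac{1}{\pi}\left(F[\pi]\,\frac{\mathrm{Sin}[k\pi]}{k} - F[-\pi]\,\frac{\mathrm{Sin}[-k\pi]}{k}\right) - \frac{1}{k}\,\frac{1}{\pi}\int_{-\pi}^{\pi}\left(f[x] - a_0\right)\mathrm{Sin}[kx]\,dx \\
&= -\frac{b_k}{k}
\end{aligned}$$

and similarly 

$$B_k = \frac{a_k}{k}$$

Notice that the uniformly convergent series gives 

$$\begin{aligned}
F[x] - F[\xi] &= \sum_{k=1}^{\infty}\left(A_k\left(\mathrm{Cos}[kx] - \mathrm{Cos}[k\xi]\right) + B_k\left(\mathrm{Sin}[kx] - \mathrm{Sin}[k\xi]\right)\right) \\
&= \sum_{k=1}^{\infty}\left(\frac{a_k}{k}\left(\mathrm{Sin}[kx] - \mathrm{Sin}[k\xi]\right) - \frac{b_k}{k}\left(\mathrm{Cos}[kx] - \mathrm{Cos}[k\xi]\right)\right)
\end{aligned}$$

Replace F[x] by its definition and the differences by integrals, 

$$-\frac{1}{k}\left(\mathrm{Cos}[kx] - \mathrm{Cos}[k\xi]\right) = \int_{\xi}^{x}\mathrm{Sin}[k\zeta]\,d\zeta \qquad \frac{1}{k}\left(\mathrm{Sin}[kx] - \mathrm{Sin}[k\xi]\right) = \int_{\xi}^{x}\mathrm{Cos}[k\zeta]\,d\zeta$$

to see the uniformly convergent series 

$$\int_{\xi}^{x} f[\zeta]\,d\zeta - a_0\int_{\xi}^{x} d\zeta = \sum_{k=1}^{\infty}\left(a_k\int_{\xi}^{x}\mathrm{Cos}[k\zeta]\,d\zeta + b_k\int_{\xi}^{x}\mathrm{Sin}[k\zeta]\,d\zeta\right)$$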
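As an illustration of termwise integration, take the sawtooth series of Example 13.9, which does not converge uniformly. Integrating each term from 0 to x gives 2(−1)^(k+1)(1 − Cos[kx])/k², and the theorem says these should add up to the integral of x, namely x²/2. The Python sketch below checks this; the function name and the number of terms are assumptions made for the illustration.

```python
import math

def integrated_sawtooth(x, n):
    """Sum over k = 1..n of the integral from 0 to x of 2*(-1)^(k+1)*sin(k*t)/k,
    which is 2*(-1)^(k+1)*(1 - cos(k*x))/k^2."""
    return 2 * sum((-1) ** (k + 1) * (1 - math.cos(k * x)) / (k * k)
                   for k in range(1, n + 1))

for x in (0.5, 2.0, math.pi):
    # Compare the termwise-integrated series with the exact integral x^2/2.
    print(x, integrated_sawtooth(x, 20000), x * x / 2)
```

The integrated terms are bounded by 4/k², so the integrated series converges uniformly, including at x = π where the original series has its jump.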




